GPT-4 is still the KING of AGENT LLMs!

GPT-4 scores an amazing result in a new benchmark called AgentBench.

AgentBench is the first benchmark designed to evaluate LLM-as-Agent across a diverse spectrum of environments. It encompasses 8 distinct environments to provide a more comprehensive evaluation of LLMs' ability to operate as autonomous agents in various scenarios.

Links:

❤️ If you want to support the channel ❤️
Support here:
Comments
Author

Cool. We can always use more benchmarks and leaderboard comparisons 😍

KevinKreger
Author

Nice, it's good to see legitimate benchmarks starting to pop up.

inplainview
Author

I've made a generalised agent that uses GPT-3.5, and it is quite reliable.

jbexta
Author

Hi 1littlecoder!

I have a question:

I own a small business and we're working on releasing a free, open-source AI model.

Would you want to make a small video on it once it's out? Would you charge $$ for it?

Thanks for your answer! 🙂

ratiemand
Author

Shame that they didn't compare with any of the 70B-parameter Llama 2-based models, which would be a fairer comparison with the closed-source models.

MattGoldenberg
Author

I want to learn all of ML and LLMs but have no idea where to start. Please suggest the best way to start.

heisenbergwhite
Author

Bro, are you working or a student? Please let us know more about you. Your videos are really great.

pcrwmqb
Author

I recently built an agent for C# programming. Regrettably, even GPT-4, the smartest model, is still pretty dumb.

diadetediotedio
Author

I think you are forgetting that GPT-4 is not just one LLM... It's many. Benchmarking GPT-4 vs. open-source LLMs is irrelevant.

MEATHEADBooYA