Mistral Large-2 (Fully Tested): This NEW Model Beats Llama-3.1? (405B)

In this video, I'll be fully testing the Mistral Large-2 (123B) model to check whether it's really good. I'll also try to find out whether it can really beat Llama-3.1 (405B), Claude 3.5 Sonnet, GPT-4o, DeepSeek & Qwen-2. The model's weights are openly available, and it can be run locally for free for research and non-commercial use. It is said to be even better at coding tasks and also really good at Text-To-Application, Text-To-Frontend and other things. I'll test it to find out whether it can really beat other LLMs, and I'll also show you how you can use it.

-----
Key Takeaways:

📈 Mistral Large 2 Launch: Discover the new Mistral Large 2, a cutting-edge 123-billion-parameter model, released just after Llama-3.1 405B.

💬 Multilingual & Coding Pro: Mistral Large 2 supports a 128k-token context window and over 80 programming languages, rivaling GPT-4o and Claude 3 Opus.

🔍 Performance Metrics: Mistral Large 2 sets new standards in performance and cost efficiency, showcasing competitive benchmarks against leading AI models.

🧠 Smart AI Responses: Unlike other AI models, Mistral Large 2 is designed to acknowledge when it lacks sufficient information, enhancing reliability and trust.

📊 Benchmark Controversy: Explore the benchmark data discrepancies, revealing why some AI models might manipulate numbers to appear superior.

🔓 Limited Licensing: Understand the implications of Mistral Large 2's custom license, which restricts use to research and non-commercial purposes.

🧪 Real-World Testing: Watch as we test Mistral Large 2 on 9 diverse questions, comparing its performance to industry leaders like GPT-4o and Claude 3.5 Sonnet.

-------
Timestamps:

00:00 - Introduction
00:08 - About Mistral Large-2
04:00 - Testing
07:06 - Conclusion
07:54 - Ending

Comments

Brutal but honest.
That's why we come here.
Thank you!

jackflash

Please make a video of an hour or more focused on coding with AI in VS Code: developing an app or software, with live tracking of the build. Does anyone agree with me?

kashifsaeed

We never know how real the benchmarks are and the sad part is that most of the time you just have to trust the benchmarks, without proof... (nice vid)

MASTERDEV

LOL... dude, your presentation is just... I really don't know how to describe that tone... Love it, keep up the good work!

HarryHardon-qf

I'm surprised it did so well with your tests. I can't even use it as a backup for Claude. I just end up going to GPT-4 when I hit the Claude limit.

vauths

Your voice is so consistent. Wow. It's AI generated, isn't it? 😁 Idea: what about adding outtakes at the end of the video? I'd watch it.

MeinDeutschkurs

6:15 Sus butterfly. Nice video, keep going.

anasghgyc

Hey King, I think your testing method is one of the better ones out there. Would it be possible to publish all the results on a kingly website?

MrMoonsilver

Has any LLM so far gotten all the questions right, especially the one about SVG?

RealLexable

What kind of hardware do we need to run Mistral Large 2 locally?

HemangJoshi

How can I run web UIs like Gradio on Kaggle and Google Colab?

hebatullahhesham

Thanks, won't even waste 1m to try this crap

fra

Hi, I need a video on the basics, because I don't code and I want to code with AI.

kashifsaeed

In all fairness, when I asked GPT-4o the geometry question, it answered 128 in the chat app but got it right on OpenRouter.

Claude 3.5 Sonnet answered the same. 128. Both times.

Llama 3.1 405B answered 64. Both times I asked.

stonedoubt