Gemini-1.5 Pro Experiment (0801): NEW Updates to Gemini BEATS Claude & GPT-4O (Fully Tested)


In this video, I'll be talking about the new Gemini-1.5 Pro Experiment (0801). This is a brand-new update by Google to Gemini 1.5 Pro, and it makes the Gemini model beat Claude-3.5-Sonnet and GPT-4o. The model is now fully on par with SOTA models and ranks above every other model in the LMSys Arena. It can be used in Google AI Studio completely free. It is even better at coding tasks and is also really good at Text-To-Application, Text-To-Frontend, and more. I'll be testing it to find out whether it can really beat other LLMs, and I'll also show you how you can use it.
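For anyone who wants to try it outside the AI Studio UI, here is a minimal sketch using the google-generativeai Python SDK; the model identifier gemini-1.5-pro-exp-0801 and the placeholder API key are assumptions on my part, so check the model picker in Google AI Studio for the exact name available to your account.

import google.generativeai as genai  # pip install google-generativeai

genai.configure(api_key="YOUR_API_KEY")  # placeholder: paste a free key from Google AI Studio
model = genai.GenerativeModel("gemini-1.5-pro-exp-0801")  # assumed id for this experimental release
response = model.generate_content("Write a Python function that merges two sorted lists.")
print(response.text)

If that id isn't accepted, the same call works with whichever Gemini 1.5 Pro model name your key lists.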

-------
Key Takeaways:

📈 Google's New Gemini 1.5 Pro Experiment Model: Discover the latest AI model from Google, surpassing Claude 3.5 Sonnet and GPT-4o in the LMSys Arena.

🔍 Free Access to Google AI Studio: Test out the Gemini 1.5 Pro Experiment model for free and experience cutting-edge AI technology firsthand.

🤔 No Benchmarks Yet: Explore the capabilities of this experimental model yourself, as official benchmarks haven't been released.

🔜 Possible Gemini-2 Preview: This might be an early look at the Gemini-2 model, rumored to keep the same 2-million-token context window as the current Gemini 1.5 Pro.

✅ Impressive Performance: From math problems to coding tasks, Gemini 1.5 Pro excels in various challenges, proving its robust AI capabilities.

🎨 Multimodal Abilities: Beyond text, this AI can handle images, video, and more, making it a versatile tool for content creators and developers.

💡 AI Innovation from Google: Stay ahead of the curve with one of the best models from Google, potentially surpassing competitors like Claude 3.5 Sonnet.

---------
Timestamps:

00:00 - Introduction
00:55 - About Gemini 1.5 Pro Experiment 0801
01:54 - Testing (Textual)
06:37 - Textual Question Final Results
07:02 - Multimodal Testing
08:14 - Final Conclusion
08:53 - Ending
Comments

It answers like a politician. Google upsets me with its insane policy guard. I swear Nuns built and trained it.

hope

Just tested it. I like it. I did a kids' story. Lots of humour and whimsy. Absolutely brilliant. Ten times better than any other LLM, and it beats ChatGPT by miles.

Dystopia

Great, can you make a video with Aider and this?

bestcinemaonline

Make a video building a full-stack app using Next.js and Supabase with Aider and this ❤

search-bd

Use Aider with it bro, please. Love your content, wish I could support more!

hamzaIVX

I'm interested whether you designed questions 7 and 9 like that on purpose, because they throw me off too. I know what the long and short diagonals are because I've studied them, so for me and you they're defined and we know what they mean; that is their name. But someone who hasn't studied them, even if they are better at maths than us, will struggle greatly with this question. For them, the question is more complicated. They would think along these lines: a long diagonal goes from one corner to another corner, giving the line the greatest length possible, and then there is a diagonal within the hexagon giving the shortest length. Interesting points here are that the two lines don't have to start from the same place (but can), and that they don't have to be at a vertex/corner of the shape (if you know what a short diagonal is then yes they do, but if you don't know what it refers to then no, they don't). This question will take anyone a great deal of thinking and time unless they're just repeating it from before.

Earlier, when I said the greatest length is the long diagonal, we could say it doesn't have to be the longest diagonal possible, just the longer of the two. Is it the longest diagonal of the shape, or the longer of the two? It depends on how the long diagonal is defined (for those who haven't studied it and don't know what it refers to). Therefore it's undefined, so either one is fine; it's just another point that comes up. It's a good test question, though, to see what the model says.

Typically, the models I have tested so far are not great at reviewing or performing multi-stage operations. For example, if a program has an issue, the models will try to apply what they know to fix the error so that there is no error, rather than taking a second look at everything else and realising that this error might suggest a mistake elsewhere, one that would keep the program from achieving its objective even if this error were resolved. They will often get stuck on a problem and continue to suggest the same solution, or fail to provide the correct solution in those scenarios.

Interesting questions; if someone asked me those two questions in person, I would walk away. Currently the models have a great advantage in speed and memory but lack intelligence, so they need a lot of guidance and prompting. Then there are the token limits that get in the way. Gemini's context window makes it really interesting, for programming for instance.
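For reference, the geometry the question leans on, using the standard definitions for a regular hexagon with side length $s$ (the long diagonal joins opposite vertices, the short diagonal joins vertices two apart):

d_{\text{long}} = 2s, \qquad d_{\text{short}} = \sqrt{3}\,s, \qquad \frac{d_{\text{long}}}{d_{\text{short}}} = \frac{2}{\sqrt{3}} \approx 1.15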

ZeerakImran

Honestly, it doesn't beat 3.5 Sonnet in my tests.

juliovac

Would love to see an aider video with this. The insane context should be really good for that.

fuba

Please please please make an Aider video with this.

Unifactyt

Of course we want an aider video with it

ganian

Seeing it working with Aider would be a great video to watch. Big fan of your channel. Thank you ❤

mr.arshed

Since your benchmark is uploaded to YouTube and Google has the transcripts, I wonder if that means that this new version is trained... on your benchmark?

supercurioTube

More videos with Aider + Gemini-1.5 Pro Experiment (0801), please 🙏

marma

No. It's not equal to Claude 3.5 Sonnet, let alone better.

ilyass-alami

Always showing me the news in AI. Good job!

TsillALevi

I am tired of testing Gemini and seeing it fail every single time. Maybe after a few years I will test it.

haydar_kir

I don’t see a join button on your channel

stonedoubt

Have any of the tested AIs succeeded on the SVG generation test?

alissonprimo

Let's ask it whether the US government has been using this kind of AI (and better) since 2012, and whether you are an AI as well, because Google's AlphaProof proved it's good at math by achieving silver last week 😅

RealLexable