Gemini 1.5 Pro Tested - The WORST Frontier Model Yet

I had such high hopes for Gemini 1.5 Pro. Let me show you what happened.

Join My Newsletter for Regular AI Updates 👇🏼

My Links 🔗

Media/Sponsorship Inquiries ✅
Comments

You know I always try to find the positive side of things, but this one was tough. Have you had better success with Gemini 1.5 Pro?

matthew_berman

Every time Matt asks, "how many words are in your response to this prompt" I'm hoping some LLM will reply with "one"

MarkLayton

When a measure becomes a target, it ceases to be a good measure.

AAjax

You should have
- Stayed with the experimental version
- Turned off all safety blockers
- Tuned down the temperature to minimum
- Used a good system prompt

For me, these settings are yielding much better results across the board, even compared to Claude.
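The checklist above can be sketched as a request body. This is a minimal, hypothetical example of what those three tweaks (safety filters off, minimum temperature, system prompt) look like as a `generateContent` payload in the style of Google's public REST API; the exact field names may differ from the current schema, so treat it as an illustration, not a definitive reference.

```python
# Sketch of a Gemini request payload applying the commenter's advice:
# safety blockers off, temperature at the minimum, and a system prompt.
# Field names follow the style of the public generativelanguage REST API
# at the time of writing; verify against current docs before relying on them.

SAFETY_CATEGORIES = [
    "HARM_CATEGORY_HARASSMENT",
    "HARM_CATEGORY_HATE_SPEECH",
    "HARM_CATEGORY_SEXUALLY_EXPLICIT",
    "HARM_CATEGORY_DANGEROUS_CONTENT",
]

def build_request(prompt: str, system_prompt: str) -> dict:
    """Build a generateContent-style request body with the tweaks applied."""
    return {
        # "Used a good system prompt"
        "systemInstruction": {"parts": [{"text": system_prompt}]},
        "contents": [{"role": "user", "parts": [{"text": prompt}]}],
        # "Turned off all safety blockers"
        "safetySettings": [
            {"category": c, "threshold": "BLOCK_NONE"} for c in SAFETY_CATEGORIES
        ],
        # "Tuned down the temperature to minimum"
        "generationConfig": {"temperature": 0.0},
    }
```

The official SDKs expose the same knobs (`safety_settings`, `generation_config`, `system_instruction`) on the model constructor, so the dict above maps onto them directly.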

malhashemi

Google's marketing team is fire, but the actual product...

TheFocusedCoder

I no longer include Gemini in my company's AI integrations. There are better alternatives. They really botched it big time.

jarnMod

I found a prompt hack for medical advice. I told GPT-4o that I was a medical student working on a case study. It spilled the beans. Haha.

shanehixson

If you get that error again, click on "edit safety settings" and remove all the safety options; it worked for me.

maniktomar

With these tests, what if you turned the temperature down to 0.1 or 0.2 to minimize luck?

ThatNerdChris

You should've edited the safety settings for the snake test; for some reason it can block certain outputs, even harmless ones.

TheRealUsername

Please do not retire the "how many words are in your response" question. It is super important for many marketing use cases because we work with limited space all the time and I hope a future model can solve it - maybe with the use of another tool and/or better planning.
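The tool half of that hope is the easy part; the hard part is a model planning its wording around the count. A minimal sketch of what such a word-count tool could look like (the function names and the whitespace-delimited counting rule are assumptions for illustration):

```python
import re

def count_words(text: str) -> int:
    """Count whitespace-delimited words, the way the test question tallies them."""
    return len(re.findall(r"\S+", text))

def fits_budget(draft: str, max_words: int) -> bool:
    """Check a draft against a word budget -- the kind of helper a model
    could call via function calling while drafting limited-space copy."""
    return count_words(draft) <= max_words
```

Exposed as a function-calling tool, a model could draft, check, and revise until the count fits, rather than guessing the length of its own output.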

lovekeys

For me, using the experimental version for coding has actually been pretty good. For reference, I'm coding a React/Node project. Perhaps it's because I give it very detailed prompts that it gives me good results, I don't know. Whereas if it has to "think" about what to do, it might not do well.

But yeah, I've been pleasantly surprised.

robertgeczi

I use it only for translation and extracting data from very long documents. For other tasks I use Claude 3.5, and GPT-4o only for grammar corrections and JSON output, at the moment :)

micbab-vgmu

There is a hack. Google, like the makers of the other models, has censored this one too. Ask it a controversial question that will prompt it not to respond. Run the inference; the output will be blocked. Run the same question again, and it will block it again. Then click the up-arrow button once, click it again to run the model, and it will respond to any question you ask, and answer quite perfectly.

Cine

11:45 - its vision answer was actually impressive. Wait until we have a tiny recorder that livestreams our lives to the AI, and it can remember and answer questions that connect years of one's information together.

ytubeanon

Gemini kinda sucks. I took the 2 months free, but was not inclined to continue. I'm sure it will improve over time, but for now it's getting blown away by Claude, GPT 4o/mini, and Llama 405b

ziff_

Actually, Gemini 1.5 is the best model for summarizing long content (an entire book, or an entire codebase). I agree that for coding and reasoning, it's not the best out there.

faaz

After testing the Gemini 1.5 Pro for about a week, I can only say that this model is absolutely insane.

etherhealingvibes

The way that Google Gemini is consistently dumber than the other models is almost impressive at this point.
This is Google we're talking about, one of the biggest companies on the planet, and they can't compete with these smaller startups?

Yipper

I have been testing and building with Gemini for many months and have been pleased only when I set the temperature between 0.30 and 0.50 for coding-related queries. The latest experimental model is terrible; I cannot use it without errors. Love your work and channel. Thanks.

aibeginnertutorials