GPT4o Mini - Lightning Fast, Dirty Cheap, Insane Quality (Tested)

I'm blown away by OpenAI's GPT4o Mini, which may be the most important model they've ever released.

#HPUnlocksAI #IntelCoreUltra

Comments
Author

Have you tried GPT4o mini? Are you impressed?

matthew_berman
Author

Looks like both GPT4o and Mini use the same agent to analyse images. Both cost the same to run, but Mini adjusts its token counts so that it comes out at the same price as GPT4o vision.

giorgim
Author

My 6-month-old just loves watching you; it's the morning ritual. Bottles and AI news with dad. She growls if I pause the video.

millerjo
Author

I think what is really happening is that for images, GPT-4o mini is using a different model. Probably, GPT-4o explains the image to mini, and then it outputs the result. I just wish OpenAI were more transparent.

daryjoe
Author

A new question you could add is taking a common item and asking how many would fit into another common item. For some reason most models really struggle with this, and you could switch it up really easily if they start training on what you ask. They usually don't think about packing efficiency, or they just have no idea what size things are. If you ask small models how many apples fit in a 5-gallon bucket, they'll sometimes give you outrageous answers like 12000 apples.

You can also add complexity by adding in density. For example, ask how many 5-pound blocks of cheese would fit in a refrigerator.

The top models can usually do it pretty well but will forget details sometimes. Small models really have a hard time.

coleabbott
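
For reference, the kind of back-of-the-envelope estimate this question is looking for can be sketched in a few lines of Python. This is only a rough sketch under stated assumptions: the apple is treated as a sphere about 7.5 cm across, and 0.64 (roughly random close packing of equal spheres) stands in for packing efficiency. The function name and both numbers are illustrative and do not come from the comment.

```python
import math

US_GALLON_L = 3.785  # litres per US gallon


def estimate_count(container_volume_l, item_diameter_cm, packing_efficiency=0.64):
    """Estimate how many roughly spherical items fit in a container.

    packing_efficiency is the fraction of the container actually filled
    by items; 0.64 approximates randomly packed equal spheres (assumption).
    """
    radius_cm = item_diameter_cm / 2
    # Sphere volume in cm^3, converted to litres (1 litre = 1000 cm^3).
    item_volume_l = (4 / 3) * math.pi * radius_cm ** 3 / 1000
    usable_volume_l = container_volume_l * packing_efficiency
    return math.floor(usable_volume_l / item_volume_l)


bucket_l = 5 * US_GALLON_L
# Assumed apple diameter: 7.5 cm (illustrative, not from the comment).
apples = estimate_count(bucket_l, item_diameter_cm=7.5)
print(apples)
```

Under these assumptions the estimate lands in the tens of apples, which is why an answer like 12000 is off by orders of magnitude.
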
Author

Looks like your questions are being trained on.

Maybe try more programming questions.

techikansh
Author

They trained the mini on your vids lol

axl
Author

"The snake can go back in its own body." You haven't collected any food to extend the snake, so it can't collide with its own body yet. Almost certain this is working as intended.

itztwistrl
Author

People often MISS the point of such a release. If this model is this good, and this cheap, it means they have made some kind of architectural breakthrough in terms of efficiency. Scaling this up will be a game changer.

panzerofthelake
Author

I'm so happy you're covering this...this is a massive value add to the industry and all I'm hearing from everyone else is "AI is slowing down".

jaysonp
Author

I recommend replacing "glass" with "cup" or "mug" in your marble problem. It may be interpreting "glass" to be a solid object of glass.

Feynt
Author

I did my usual story test just now with mini. It's pretty darn good, on the level of GPT4o at least, and it avoids some GPT4 issues. I even think this model might have some ability to plan its responses, because this is the first time I've had an LLM accurately title a story before writing it.
11:02 it's DEFINITELY doing some "internal thinking" of some kind.

Yipper
Author

More tokens may mean more detail, and hence more precision, for images.

Author

I think they trained it on higher-precision image embeddings, hence the higher token count. If I'm right, it should be able to accurately understand denser images than 4o.

adamholter
Author

At some point you might want to step up the game generation question with something like Frogger, Asteroids, or Space Invaders.

CrudelyMade
Author

9:00 could this be because 4o is "overthinking" the problem?
Heck, a lot of humans make this mistake: instead of trusting the initial simple answer, they doubt themselves, say "there must be something more to this", and overcomplicate their logic (and introduce errors).

Baleur
Author

I think you should ask the models your test questions multiple times, like a best of 3, because I suspect gpt4o-mini beating gpt4o was just random noise and not necessarily representative of the models' capabilities.

epsilonray
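
A best-of-N check like the one suggested above could look roughly like this in Python. It is a minimal sketch, not anyone's actual test harness: `best_of_n` and `fake_model` are hypothetical names, and `ask` stands in for whatever function sends a prompt to a model and returns its answer as text. A canned fake is used here so the example runs without any API.

```python
from collections import Counter
from typing import Callable, Tuple


def best_of_n(ask: Callable[[str], str], question: str, n: int = 3) -> Tuple[str, float]:
    """Ask the same question n times and return the majority answer
    together with the fraction of runs that agreed with it."""
    # Light normalisation so "Yes" and "yes " count as the same answer.
    answers = [ask(question).strip().lower() for _ in range(n)]
    answer, votes = Counter(answers).most_common(1)[0]
    return answer, votes / n


# Hypothetical stand-in for a real model call: replies vary between runs.
_replies = iter(["3", "4", "3"])


def fake_model(question: str) -> str:
    return next(_replies)


answer, agreement = best_of_n(fake_model, "How many words are in your answer?", n=3)
print(answer, agreement)
```

A low agreement score on a question is a hint that a single pass or fail on it may be noise rather than a real difference between models.
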
Author

You need to come up with more questions now that it aced all of them except for one.

Axel-gnii
Author

Amazing video as always @Matthew. I think a "generate a complex SQL query from a 5-table schema" question and a function-calling test would be amazing additions to your battery of tests.

James_PET
Author

Already tested; I'm using it for simple tasks such as a semantic router. Can't beat the price/performance ratio.

tomaszzielinski