o3 and o4-mini - they’re great, but easy to over-hype

Critical analysis of the two most powerful new models behind ChatGPT, o3 and o4-mini. Not just the system cards, benchmarks, and my own tests, but some you may not have seen before. Yes, they can whip up amazing front-end in a few seconds, but you always have to ask what is in their data. Either way, they prove the gains from RL are just beginning…

Chapters:
00:00 - o3 and o4-mini

Comments

i can feel the AGI with this upload frequency

codersanInHoodie

Two videos in a day, goodness gracious!

MaJetiGizzle

2033: “In the short lead-up to the intelligence explosion, one under-appreciated indicator for rate of progress was the upload frequency of the YouTube channel ‘AI Explained’”

jPup_

This channel is the only non-BS channel on YouTube. Baller dude.

JordanCrawfordSF

I know this might not be a popular opinion but, don’t stress yourself out!!

All of us are waiting here and will watch your video regardless of whether you release it now or later.
We watch your videos because of your in-depth research and explanations anyway!

SayWhat

Philip is the kind of guy that hits F5 while already recording to not miss a single breaking detail

crowogenesis

Just watched your last video and thought "Wow, he missed the o3 launch for a bit... I wonder if he's going to drop another video soon".
And here it goes.

filipewnunes

What a wonderful day to have two videos in a single day, top-tier quality as always.

a.s

I was gonna go to sleep! But how can I miss another AI Explained video? Also, that was fast!

maks_st

Maybe performance on benchmarks isn't a good barometer for AGI.

If you give a human a task and they suck at it, they are able to learn and get better.

If you give an LLM a task and it sucks at it, you need to spend millions of dollars and months to train another version, or fine-tune it (which usually collapses other parts of the model).

It seems like we have been moving the goalpost for what constitutes AGI, and that actually makes sense. We get to these points where benchmarks are saturated, and then we realize, "Wow, this is nice, but obviously, this is not AGI."

I'm doubtful about LLMs, but I'm excited to see what people come up with next.

Dom-zyqy

Best channel is this. I don't need no stinkin' AI to tell me that.

rantmarket

Thank you, Philip. Was anxious to see your take on these. Hope you had a good flight.

DanBarbatti

I love the balance between David's and your videos ❤

baumwollejr

The first time I read Lost Scrolls of Jewish Wealth, I couldn’t believe how much I had been overlooking. It’s one of those books that stays with you and makes you rethink everything.

pushpampushpam

Another day, another grounded analysis from you. Yesterday’s launch needed this perspective—cheers!

latand

The difference in tone between this video and Dave Shapiro's is a perfect representation of the different cultures in London and San Francisco imo, and this by itself is another reason to favor DeepMind over OpenAI; one is grounded in reality, the other is comprised solely of the silver lining, without the cloud itself.

ArduousNature

Gotta keep my expectations accelerating too. Looking forward to 3 videos in one day

Loris--

No AI channel is better than this one. I always wait for "AI Explained" before climbing on a hype train. Thank you

anta-zjbw

one little tidbit that I haven't seen making the rounds is how GPT-4 is going to be removed at the end of the month.

people keep talking about fireship, this is my AI news channel. I trust this guy more than any of them.

brianhopson

I think tool use for o3 and o4-mini is very, very, very underrated and underhyped. This is what we want reasoning to be able to do: reasoning and using tools concurrently is closer and closer to full-on multimodal reasoning and internal simulations

Words-.