What Grok 3 Gets That o1 Pro Missed

Hey everyone - dropping this early for my incredible members whose support fuels this AI deep dive! Your backing allows me to transform detailed livestream sessions into sharp, actionable content. There’s a ton happening in AI right now, and editing these sessions takes real effort. If you love getting authentic, engineer-tested AI insights early, please consider joining:

This video goes public on Sunday, but your early support truly makes a difference. 🙏

After spending years at Apple, I've learned the importance of detailed testing. In this video, I put Grok 3 head-to-head against o1 Pro, revealing insights that dramatically changed how I structure my AI workflow.

You'll discover:
👉 The crucial feature Grok 3 has that o1 Pro overlooked
👉 Why planning your data architecture transforms AI performance
👉 How Grok 3 utilized my codebase context to deliver stunning results
👉 The hidden advantage of reasoning time in AI models

🕒 Key moments:
00:00 AI worth your money
00:17 Describing app's purpose
00:49 Letting Grok 3 cook
01:52 Comparing with ChatGPT
02:32 Grok's massive plan
03:19 Grok's planning stage
03:50 Automating LLMs as junior engineer
04:33 Requesting mermaid diagram
05:10 Comparing with OpenAI
06:37 Looking at o1 Pro
07:50 Modeling data hardest part
08:46 Delivering WikiTok front end
09:30 Ranking Grok 3, o1 Pro
10:33 Grok 3 limitations
11:06 AI meets real life

Join this channel for exclusive perks:

🔔 Subscribe for weekly insights—your comments shape next week’s content. This channel bridges AI theory and real-life application. Next week's episode? Even more surprising!

Catch me live every week to explore AI tools that genuinely deliver.
Comments

Yep, 100% agreed.

Ended up canceling my openai sub to get Grok3 and Cursor Pro. Beyond worth.

clahss

Take a shot every time Ray says "cook"

SR-tijj

I agree, Grok 3 rocks! One caveat to be aware of: there's a bit of an asterisk with the 1M token context window. It's effectively limited to 128k due to computational restrictions. It's allowed to stretch that a bit if the case warrants it, but for practical purposes it's a 128k window.

BruceWayne

Grok 3 does really well at creative writing as well. I had it write a Stephen King style horror short story and it did the best fiction writing of any of the models I've tried. It was actually a clever, well-constructed story. The names were unique, the writing wasn't generic, and I didn't even use the thinking mode, just standard.

antonkryzsko

Great video, Ray! Thanks for sharing the info! Jay

SouthbayJay_com

Really enjoyed this breakdown of Grok 3 and its capabilities! It’d be fascinating if you created a standardized questionnaire or benchmark to evaluate reasoning skills across LLMs like Grok 3, o1 Pro, and ChatGPT. With new models emerging constantly, a reasoning-focused baseline could highlight real progress—beyond just specs or hype. Maybe even tie it into your comparisons (e.g., 09:30 ranking section) or the junior engineer automation idea (03:50). What do you think?

ArConfident

Hey Ray! 0:35 What is that app you are using to craft a prompt?

alisherashekeyev

great video! what app do you use to describe the initial plan and how do you pass your project files (list?) to grok?

lowenf

How does Grok handle capitalized letters? Does it normally ignore them, or does it actually have to be told to pay attention to them? For example, would it recognize hidden messages in capitals like a human would?

johnnyh-pay

Nice one Ray, I wonder what you would have got by adding Claude Sonnet 3.7 + extended thinking to the same comparison. I have been doing these comparisons myself and lately Claude is my “current” favourite. Grok felt really good but misses the mark and hallucinates a lot.

PraneyBehl

I've got a sophisticated software development project that I'm about to embark on, and I'm going to experiment with the idea of having a massive context window that can see me through the whole project. Google's NotebookLM comes to the party with that. I use other AI outputs as well, but I can just copy and paste them into NotebookLM and continually accumulate everything from everywhere else, including API manuals and all of that sort of stuff, and hopefully it will get me all the way through the project.

TerenceKearns

Gemini 2.5 Pro Experimental (03-25) is rocking *LMArena* and *Aider*, and I believe *LiveBench* soon too.

cbgaming

After the GPT-4.5 release I appreciate Grok 3 more. After Claude 3.7 it is my second favourite model :)

micbab-vgmu

I tried to use Grok to generate images, but it refused to upload a single image.

brantjustilian

Youtube is giving me way too much credit by suggesting me this video. I've been lost the entire time.

Mustash-Tony

I compared all known and popular AI models on my phone and tablet, and GROK beats them all—even just the free version.

But if y'all pay the $30/month subscription for SUPER GROK and compare it to the competition (with their paid subscriptions too), y'all will see how SUPER GROK annihilates every single one of them.

And the best part? This is just the beginning. As Elon and his team said, 'Y'all haven't even seen what Grok can really do yet.'

ThinkForge

Grok is great, but I think Gemini 2.5 is a bit better.

Katsunam

Ray, I love you bro, not throwing shade, homie, but I've used Perplexity and looked up so many sources on the pro version across so many websites, and nobody can confirm that it's 1 million. Even the fine print on the Grok website doesn't exactly say it's 1 million, and a lot of people are still tapping out at 125,000. I hope it's 1 million, but I haven't gotten that yet and many others haven't. Not saying you're wrong, just saying it's pretty ignorant of the Grok team not to be very specific about the context window.

razorbackroar

Can Super Grok remember stuff from previous conversations?

bandhanjha

The model is capable of 1M tokens, but you may want to do more research because it's likely not using more than 200k or so. But it's capable. Ask Grok itself about it.

quantumrift