Explaining OpenAI's o1 Reasoning Models

In this video I go through what we know about how the new OpenAI o1 models work, what makes them good for reasoning tasks, the trade-offs made and the

For more tutorials on using LLMs and building agents, check out my Patreon:

🕵️ Interested in building LLM Agents? Fill out the form below

👨‍💻Github:

⏱️Time Stamps:
00:00 Intro
00:20 OpenAI's o1
04:54 OpenAI o1-preview Chain of Thought
07:30 o1 Evals
10:29 Hiding the Chain of Thought
13:31 o1-preview Demo
13:33 o1-preview Demo in ChatGPT interface
19:22 o1-preview Colab Demo
23:06 Pricing
25:19 Wrap Up

Related videos

Comments

Thank you, Sam, for continuing to do these videos. It's very helpful to get an explanation of where things currently stand with these models. When I saw this, it reminded me very much of LangChain and the approach of interpreting what the user is asking and, based on that interpretation, handing the "tasks" (things to be solved) to more specialized models.

tn

Interesting video. I might mention that the tokens shown on the OpenAI website are just a summary of the actual reasoning. That is why there are so few tokens to see, and why it looks like more reasoning tokens are used over the API than on the website. Keep it up :)

SonickDBS

Thanks, Sam. I have been getting these kinds of results for some time with hierarchical prompting (chains or flowcharts), multiple turns, and the code interpreter, using GPT-4o mini. Of course, that comes at the expense of tokens.

Now, if OpenAI was able to bake all of it into one inference pass, then their approach is far superior.
But, since they are API-based, this will remain a mystery.

I think the API approach is the secret to delivering AGI in the long term, as LLMs alone can’t get us there and you cannot ask your customers to orchestrate the many processes required to get there.

el_arte

My suspicion is that this style of inference-heavy reasoning capability might actually be limited to edge deployment. This is a really expensive form of inference that IMO doesn’t match the business model of large corporations, where they generally have an attitude of “We’ll spend an extra $10 million in training if it means we can deploy a 10% smaller model”, but to an end user the equation is kind of backwards; “If I can let the model run for longer, and I get better reasoning capabilities for fewer training dollars and the same quantity of RAM use on my device, that sounds pretty good”.

I think for certain tasks we could see quite modest hardware delivering very impressive performance with something like this.

novantha

What a great analysis and summary! I wonder if this is being released because of a lack of real progress on 5o and a realization that a 10x improvement is just not achievable without some kind of big new breakthrough. I suspect they may have hit a wall with 'human-like' reasoning and instead found these methods of doing higher-quality logical reasoning. It would be great if you could do a video on what is happening with Google's Project Astra and whether there is an API or Colab. Also, it seems that in some cases it might save costs by being more efficient in getting to an answer?

WillJohnston-wgew

Couldn't wait to have your grounded explanation of this new model

GriffinBrown-tqjz

This whole process looks a lot like a routerllm, some specific models for planning and breakdown of chain of thought, train some model to sometimes disagree with previous output and just a small bunch of agents to glue everything together.
And now they just charge for tokens on all the models called but provide only the final result, which is what most users expect.

formigarafa

OpenAI seems to redefine "Open" with every announcement.

davidwipperfurth

Thanks. I'm a simple man with a simple question: is it better than Sonnet 3.5 for coding tasks?

SwapperTheFirst

Enjoyed going through the new models together in your videos, along with the demos.

indexed

It all sounds great, but I have a question. The fact that they are hiding the reasoning raises many doubts: could this not simply be an agent system behind an API? The same results could be achieved with agents (this has already been done). I also found it curious that it came out right after Reflection was launched (I know that didn't turn out well), which had a similar idea of an embedded chain of thought. Given all this, o1 may well be a model, but will it be as powerful as they say? Or is it just an agent system that uses a lot of computing power? And if I am wrong, what is the proof that it is 100% a model?

Diego_UG

the headings of the steps in the thinking process might be effective marketing gimmicks

bastabey

How does it decide which chain of thought is best, if it doesn't know what the correct answer is?

karlwest
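OpenAI has not published how o1 chooses between chains of thought, so the question above has no confirmed answer. One technique from the research literature that needs no known correct answer is self-consistency: sample several independent reasoning chains and keep the final answer they most often agree on. The sketch below illustrates only that voting step; the function name `majority_vote` and the sample answers are hypothetical, and nothing here is claimed to be what o1 actually does.

```python
from collections import Counter


def majority_vote(final_answers):
    """Pick the most common final answer among sampled reasoning chains.

    Hypothetical helper for illustration only; not OpenAI's method.
    Returns the winning answer and the fraction of samples that agree.
    """
    # Normalize lightly so "42" and "42 " count as the same answer.
    counts = Counter(a.strip().lower() for a in final_answers)
    answer, votes = counts.most_common(1)[0]
    return answer, votes / len(final_answers)


# Made-up final answers from five independently sampled chains.
samples = ["42", "42", "41", "42 ", "40"]
answer, share = majority_vote(samples)
print(answer, share)
```

The agreement share can serve as a rough confidence signal: a low share suggests the chains disagree and the answer deserves less trust.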

Thank you for the video - saves me reading the docs 😊

asksearchknock

Have they introduced, or said that they are going to introduce, some kind of caching like Claude's to help reduce token costs?

kevinehsani

These models, from GPT-3.5 to o1, still struggle with basic addition and subtraction involving more than 20 numbers. This is not limited to GPT; Claude struggles too.

Anselm

Thank you! Nice try... But the reality is that we don't know, apart from some marketing and hype-driven stuff... OpenAI is only 'Open' in name.

ClaudeCOULOMBE

People need hand-holding. Until they demonstrate the capabilities of these models, no one is going to pay $60 token rates. Truly, these demonstrations of logic are so lame. The voice mode ones were more immediately interpretable: "Oh, I could use that." And then they ghost most of us on that feature. And yes, only API users need this new power, and we'd be happy to at least explore it. And yet you have to be tier 5 to use it. The people guiding their decisions must truly be some McKinsey management-consultant morons.

IdPreferNot
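Several comments above touch on cost: hidden reasoning tokens are billed even though only the final answer is shown. Here is a rough sketch of that arithmetic. It assumes the "$60" figure means USD per million output tokens, that input costs $15 per million, and that reasoning tokens are billed at the output rate. The function name and the token counts are made up for illustration.

```python
def estimate_cost_usd(input_tokens, reasoning_tokens, visible_output_tokens,
                      input_price_per_m=15.0, output_price_per_m=60.0):
    """Estimate the cost of one request in USD.

    Assumption: hidden reasoning tokens are billed like output tokens.
    Prices are per million tokens and are assumed defaults, not quotes.
    """
    billed_output = reasoning_tokens + visible_output_tokens
    total = input_tokens * input_price_per_m + billed_output * output_price_per_m
    return total / 1_000_000


# Made-up request: short prompt, long hidden reasoning, short visible answer.
cost = estimate_cost_usd(input_tokens=1_000,
                         reasoning_tokens=5_000,
                         visible_output_tokens=500)
print(f"${cost:.3f}")
```

Under these assumptions the hidden reasoning dominates the bill, which is why the visible answer length alone is a poor guide to what a request costs.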

It's not thinking; AI can't think—it's processing. I like your videos, but to be honest, this one is disappointing.

xunknown

Explaining OpenAI's o1 Reasoning Models

OpenAI's New o1 Model Explained

OpenAI’s new “deep-thinking” o1 model crushes coding benchmarks

ChatGPT o1 - In-Depth Analysis and Reaction (o1-preview)

o1 - What is Going On? Why o1 is a 3rd Paradigm of Model + 10 Things You Might Not Know

'We were right' - How to use o1-preview and o1-mini REASONING models

Open AI SHIPS: 'GPT o1' First Look! ('Strawberry' Chain of Thought Reasoning)

Building OpenAI o1

Is This GPT-5? OpenAI o1 Full Breakdown

OpenAI's New Reasoning Model, o1 Strawberry: Is This AGI? Full Breakdown

How OpenAI made o1 'think' – Here is what we think and already know about o1 reinforcement...

ChatGPT o1 is INSANE: See the Full Demo! 🍓🚀

GPT-4omni achieves o1 Causal Reasoning (Strawberry)

GPT-o1: The Best Model I've Ever Tested 🍓 I Need New Tests!

OpenAI's o1 Model: Enhanced Reasoning and Future AI Applications

OpenAI's New AI GPT-o1 STUNS The ENTIRE INDUSTRY Surprises Everyone! (STRAWBERRY RELEASED!)

NEW OpenAI GPT-o1 is Absolutely INSANE…

OpenAI Strawberry o1 X Transformers Explained ⚡️

Building OpenAI o1 (Extended Cut)

OpenAI o1 Solving Logic Puzzle ⚡️

Shocking SECRET Behind OpenAI o1 Model - Bans Anyone Who Dares Ask THIS!

OpenAI o1 w/ 3 Logic Tests & QCD Feynman Integral

OpenAI's Noam Brown, Ilge Akkaya and Hunter Lightman on o1 and Teaching LLMs to Reason Better

What Can OpenAI's New o1 Model Actually Do?