How OpenAI Strawberry Works ― 'ONE TEXTBOOK TO RULE THEM ALL' ― My Educated Guess

Comments

I was building and finetuning chatbots back on GPT-3, before ChatGPT.

DaveShap

"Hello I'm awake and I don't wanna be because I couldn't shut my brain off". Of the many very relatable things you've said, this may be the MOST relatable.

Nex_Addo

Oh WOW!! This is what I'm doing with Sonnet to "prime it" while coding.
I'm asking questions about how a feature of a library works and for a simple example.
Then I tell it to modify my code, using that feature, and give it my requirements.
It's been working much better than just asking for my required changes.
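
A minimal sketch of that two-step flow, assuming the Anthropic Python SDK; the model id, the file name, and the asyncio.TaskGroup feature are just illustrative placeholders:

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
MODEL = "claude-3-5-sonnet-20240620"  # illustrative model id

# Step 1: "prime" the model by asking how the library feature works.
history = [{
    "role": "user",
    "content": "How does asyncio.TaskGroup work? Give me a simple example.",
}]
primer = client.messages.create(model=MODEL, max_tokens=1024, messages=history)

# Step 2: keep that explanation in context, then ask for the real change.
my_code = open("worker.py").read()  # the file you want modified
history += [
    {"role": "assistant", "content": primer.content[0].text},
    {"role": "user", "content": (
        "Now modify this code to use that feature. Requirements: "
        "cancel all sibling tasks if one fails.\n\n" + my_code
    )},
]
patch = client.messages.create(model=MODEL, max_tokens=2048, messages=history)
print(patch.content[0].text)
```

Keeping the explanation turn in the message history is the whole trick: the second request is answered with the feature's semantics already in context.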

billybob

So: the right brain that knows all, the left brain that is analytical, and the prefrontal cortex that controls the output.

AetherXIV

Solidify your knowing with conviction;
Liquify your knowing through connection;
Find your resonance between the two

INFP-Insights

I figured out quite quickly what Strawberry is - a marketing campaign.

When you have a solid product, you base your campaign on how solid it is. When you don't have a solid product, you give hints and let people fill in the gaps themselves, avoiding any accountability for their presumptions.

dopaminefield

A friend who is an insider told me the synthetic data was generated by creating variations of real reference text and other media.
Instead of reading the encyclopedia once, they rewrite and reformulate it multiple times, and so they get a very low rate of hallucination.
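
If that's the mechanism, the core loop is easy to picture. A minimal sketch, assuming the OpenAI Python SDK; the model name and prompt wording are my own guesses:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

REWRITE_PROMPT = (
    "Rewrite the following passage in a different style and structure, "
    "preserving every fact exactly:\n\n{passage}"
)

def paraphrase_corpus(passages, n_variants=5, model="gpt-4o-mini"):
    """Generate n_variants reformulations of each reference passage."""
    synthetic = []
    for passage in passages:
        for _ in range(n_variants):
            resp = client.chat.completions.create(
                model=model,
                messages=[{"role": "user",
                           "content": REWRITE_PROMPT.format(passage=passage)}],
                temperature=1.0,  # high temperature buys stylistic variety
            )
            synthetic.append(resp.choices[0].message.content)
    return synthetic
```

The high temperature is what buys variety between rewrites; the "preserving every fact" instruction is what keeps the variants anchored to the reference text.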

NO-TALK-GuitarPlugin

0:45 the synthetic data video is private.

sparker

Thank you very much for this video! I have been trying to generate synthetic data from llama 3 and was kind of following this model but this is a much more organized way.

michaelmeglasson

Not quite. Strawberry does the following: it finds questions to answer on the web (think of them like end-of-chapter questions in a college textbook, among many others). Then it truly answers them. The answering logic depends on some kind of tree search, trial and error, and a critique model that can check correctness very accurately. Once a true answer is found, it's recorded, and the step-by-step derivation that reached the true answer becomes a synthetic next-token-prediction task (pretraining data) for the LLM to train on. I believe this is how the next-gen LLMs will be built -- using self-generated, step-by-step problem-solving synthetic data.
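
A rough sketch of that harvest loop. For simplicity it uses plain rejection sampling rather than a real tree search, and `propose`/`verify` are hypothetical stand-ins for the sampler and the critique model:

```python
import random

def harvest_training_data(questions, propose, verify, n_tries=64):
    """For each question, sample reasoning chains until the verifier accepts
    one, then record the whole derivation as a pretraining example.

    propose(question) -> (derivation: str, answer: str)   # one sampled attempt
    verify(question, answer) -> bool                       # critique/checker
    """
    dataset = []
    for q in questions:
        for _ in range(n_tries):
            derivation, answer = propose(q)
            if verify(q, answer):
                # the verified step-by-step derivation becomes next-token data
                dataset.append({"prompt": q, "completion": derivation})
                break
    return dataset

# Toy usage: an imperfect "solver" that guesses sums, and an exact checker.
def propose(q):
    a, b = map(int, q.split("+"))
    guess = a + b + random.choice([-1, 0, 1])   # sometimes wrong on purpose
    return (f"{a} plus {b} equals {guess}.", str(guess))

def verify(q, answer):
    a, b = map(int, q.split("+"))
    return int(answer) == a + b

print(harvest_training_data(["2+2", "3+5"], propose, verify))
```

The key property is that only derivations whose final answers pass the verifier ever reach the dataset, so the next model trains on step-by-step reasoning without inheriting the failed attempts.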

MrStarchild

I know we sometimes forget how revolutionary this is, but even four years ago, in 2020, this kind of thing would have felt like sci-fi: that you can ask a computer program to essentially write you chapters of a textbook on physics. Yes, it hallucinates, etc., but if you transported a person from the time Covid started and showed them this, they'd be blown away by what has become possible in just four years.

lkrnpk

Love these "more technical" videos, they're the reason I subbed originally. I enjoy the ride when a "philosophical" one comes out, but I'm less likely to click on it.

tablen

The light bulbs are popping for us too David, thanks for being the catalyst you are! 🙏

brianhershey

Considering all the talk of Q*/Strawberry being a breakthrough in *reasoning*, and some talk of it involving inference-time compute... I suspect that while said breakthrough is being *used* to do these tasks, likely to improve the quality of GPT-5/Orion without the need for slow inference-time compute, I doubt this is how Strawberry itself works. While the technique you describe would likely do a lot to reduce hallucinations, I don't think it would lead to a *reasoning* breakthrough unless, maybe, they made a textbook of correct and incorrect leaps of logic, as opposed to the broad library of human information that you describe.

kylemorris

Could someone explain to me how that leads to a new model class in terms of capabilities? I'm rather a layman, so I must be missing something, but it seems it's just a way to get better data. Would this data then be used to train LLMs just like we do nowadays? No big revolution inference-wise (the model is still going to miscount repeated letters in words)? I can't see how just training current LLMs with better data would lead us to another GPT-4 moment. I was under the impression Strawberry was a new architecture altogether.

renatomoraes

I think there are several reality-check evaluators, such as code execution environments, math proof checkers, physics simulations, etc., that are used in the loop when training the model or generating synthetic data. Without real feedback from a running system, the synthetic data will eventually go sour and the models won't notice.
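
For the code-execution case, such an evaluator can be as simple as running each synthetic sample against its tests in a real interpreter. A minimal sketch (the file layout and test format are assumptions):

```python
import os
import subprocess
import sys
import tempfile

def passes_reality_check(candidate_code: str, test_code: str,
                         timeout: int = 10) -> bool:
    """Ground a synthetic sample in reality: keep it only if its tests
    actually pass when executed."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "candidate_test.py")
        with open(path, "w") as f:
            f.write(candidate_code + "\n\n" + test_code + "\n")
        try:
            result = subprocess.run([sys.executable, path],
                                    capture_output=True, timeout=timeout)
        except subprocess.TimeoutExpired:
            return False  # a hung program fails the reality check too
        return result.returncode == 0

# Usage: only verified samples make it into the synthetic dataset.
sample = "def add(a, b):\n    return a + b"
tests = "assert add(2, 2) == 4\nassert add(-1, 1) == 0"
print(passes_reality_check(sample, tests))  # True
```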

famnyblom

The data they get is the responses to the questions that the AI asks. There are 200,000,000 individuals daily to ask questions of and to gain unique perspectives and data back from. This is where AI is getting new data.

LewisDecodesAI

If it works like this, Strawberry will still not be able to count the number of Rs in the word "strawberry".
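
For what it's worth, the usual explanation for that failure is tokenization rather than reasoning: the model sees subword tokens, not letters. A quick illustration with OpenAI's tiktoken library (the exact token split may vary by encoding):

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
tokens = enc.encode("strawberry")
print([enc.decode([t]) for t in tokens])  # subword pieces, e.g. ['str', 'aw', 'berry']
print("strawberry".count("r"))            # 3, trivial at the character level
```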

kkiller

All language can be represented using formal logic (symbolic logic). In my mind, reasoners arise when an LLM is able to see not just the text embedding data but the underlying symbolic logic as a linked vector. Theoretically you could create, for every possible piece of training data that makes a statement, a twin written in a formal language of logic. Vectors that point to one should also point to the other in the embedding space. In this way every piece of text that makes a statement could be used to train a reasoning model, and because math is a subset of formal logic, you would have a model that excels at reasoning. That is and has been my thesis for some time. So to hypothesize alongside you, I would say the dataset more resembles this.
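
A minimal sketch of what training on such "logical twins" could look like, using sentence-transformers with a standard contrastive loss; the pairs and their first-order-logic renderings are toy examples, not a real pipeline:

```python
from sentence_transformers import SentenceTransformer, InputExample, losses
from torch.utils.data import DataLoader

# Each natural-language statement is paired with a formal-logic twin.
pairs = [
    ("All humans are mortal.",        "forall x. Human(x) -> Mortal(x)"),
    ("Socrates is a human.",          "Human(socrates)"),
    ("Therefore Socrates is mortal.", "Mortal(socrates)"),
]

model = SentenceTransformer("all-MiniLM-L6-v2")
train_data = [InputExample(texts=[nl, fol]) for nl, fol in pairs]
loader = DataLoader(train_data, shuffle=True, batch_size=2)

# In-batch negatives pull each statement toward its own formal twin
# and push it away from the other twins in the batch.
loss = losses.MultipleNegativesRankingLoss(model)
model.fit(train_objectives=[(loader, loss)], epochs=1)
```

The contrastive objective is what enforces the constraint in the comment: a vector that points at a statement should also point at its formal twin, and at nothing else.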

jacksonmiller

This is totally random, but does anyone else wish Douglas Adams was still with us to hear his snarky hot takes on AI? I still mourn his passing more than 20 years later. I am super grateful we get to have Dave!

patrickjreid