How OpenAI Strawberry Works ― 'ONE TEXTBOOK TO RULE THEM ALL' ― My Educated Guess

Comments

I was building and finetuning chatbots back on GPT-3, before ChatGPT.

DaveShap

"Hello I'm awake and I don't wanna be because I couldn't shut my brain off". Of the many very relatable things you've said, this may be the MOST relatable.

Nex_Addo

Oh WOW!! This is what I'm doing with Sonnet to "prime it" while coding.
I'm asking questions about how a feature of a library works and for a simple example.
Then I tell it to modify my code, using that feature, and give it my requirements.
It's been working much better than just asking for my required changes.
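
A minimal sketch of that two-step flow, assuming the Anthropic Python SDK; the model id, the file name, and the asyncio.TaskGroup feature are just illustrative placeholders:

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
MODEL = "claude-3-5-sonnet-20240620"  # illustrative model id

# Step 1: "prime" the model by asking how the library feature works.
history = [{
    "role": "user",
    "content": "How does asyncio.TaskGroup work? Give me a simple example.",
}]
primer = client.messages.create(model=MODEL, max_tokens=1024, messages=history)

# Step 2: keep that explanation in context, then ask for the real change.
my_code = open("worker.py").read()  # the file you want modified
history += [
    {"role": "assistant", "content": primer.content[0].text},
    {"role": "user", "content": (
        "Now modify this code to use that feature. Requirements: "
        "cancel all sibling tasks if one fails.\n\n" + my_code
    )},
]
patch = client.messages.create(model=MODEL, max_tokens=2048, messages=history)
print(patch.content[0].text)
```

Keeping the explanation turn in the message history is the whole trick: the second request is answered with the feature's semantics already in context.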

billybob

So: the right brain that knows all, the left brain that is analytical, and the prefrontal cortex that controls the output.

AetherXIV

Solidify your knowing with conviction;
Liquify your knowing through connection;
Find your resonance between the two

INFP-Insights

I figured out quite quickly what Strawberry is - a marketing campaign.

When you have a solid product, you base your campaign on how solid it is. When you don't have a solid product, you give hints and let people fill in the gaps themselves, avoiding any accountability for their presumptions.

dopaminefield

A friend who is an insider told me the synthetic data was generated by creating variations of real reference text and other media.
Instead of reading the encyclopedia once, they rewrite and reformulate it multiple times, and so they get a very low rate of hallucination.
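
If that's the mechanism, the core loop is easy to picture. A minimal sketch, assuming the OpenAI Python SDK; the model name and prompt wording are my own guesses:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

REWRITE_PROMPT = (
    "Rewrite the following passage in a different style and structure, "
    "preserving every fact exactly:\n\n{passage}"
)

def paraphrase_corpus(passages, n_variants=5, model="gpt-4o-mini"):
    """Generate n_variants reformulations of each reference passage."""
    synthetic = []
    for passage in passages:
        for _ in range(n_variants):
            resp = client.chat.completions.create(
                model=model,
                messages=[{"role": "user",
                           "content": REWRITE_PROMPT.format(passage=passage)}],
                temperature=1.0,  # high temperature buys stylistic variety
            )
            synthetic.append(resp.choices[0].message.content)
    return synthetic
```

The high temperature is what buys variety between rewrites; the "preserving every fact" instruction is what keeps the variants anchored to the reference text.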

NO-TALK-GuitarPlugin

0:45 the synthetic data video is private.

sparker

Thank you very much for this video! I have been trying to generate synthetic data from llama 3 and was kind of following this model but this is a much more organized way.

michaelmeglasson

Not quite. Strawberry does the following: it finds questions to answer on the web (think of them like end-of-chapter questions in a college textbook, among many others). Then it truly answers them. The answering logic depends on some kind of tree search, trial and error, and a critique model that can check correctness very accurately. Once a true answer is found, it's recorded, and the step-by-step derivation that reached the true answer becomes a synthetic next-token-prediction task (pretraining data) for the LLM to train on. I believe this is how the next-gen LLMs will be built -- using self-generated, step-by-step problem-solving synthetic data.
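
A rough sketch of that harvest loop. For simplicity it uses plain rejection sampling rather than a real tree search, and `propose`/`verify` are hypothetical stand-ins for the sampler and the critique model:

```python
import random

def harvest_training_data(questions, propose, verify, n_tries=64):
    """For each question, sample reasoning chains until the verifier accepts
    one, then record the whole derivation as a pretraining example.

    propose(question) -> (derivation: str, answer: str)   # one sampled attempt
    verify(question, answer) -> bool                       # critique/checker
    """
    dataset = []
    for q in questions:
        for _ in range(n_tries):
            derivation, answer = propose(q)
            if verify(q, answer):
                # the verified step-by-step derivation becomes next-token data
                dataset.append({"prompt": q, "completion": derivation})
                break
    return dataset

# Toy usage: an imperfect "solver" that guesses sums, and an exact checker.
def propose(q):
    a, b = map(int, q.split("+"))
    guess = a + b + random.choice([-1, 0, 1])   # sometimes wrong on purpose
    return (f"{a} plus {b} equals {guess}.", str(guess))

def verify(q, answer):
    a, b = map(int, q.split("+"))
    return int(answer) == a + b

print(harvest_training_data(["2+2", "3+5"], propose, verify))
```

The key property is that only derivations whose final answers pass the verifier ever reach the dataset, so the next model trains on step-by-step reasoning without inheriting the failed attempts.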

MrStarchild

I know we sometimes forget how revolutionary this is, but even four years ago, in 2020, this kind of thing would have felt like sci-fi: that you can ask a computer program to essentially write you chapters of a textbook on physics. Yes, it hallucinates, etc., but if you transported a person from the time Covid started and showed them this, they'd be blown away by what has become possible in just four years.

lkrnpk

Love these "more technical" videos, they're the reason I subbed originally. I enjoy the ride when a "philosophical" one comes out, but I'm less likely to click on it.

tablen

The light bulbs are popping for us too David, thanks for being the catalyst you are! 🙏

brianhershey

Considering all the talk of Q*/Strawberry being a breakthrough in *reasoning*, and some talk of it involving inference-time compute... I suspect that while said breakthrough is being *used* to do these tasks, likely to improve the quality of GPT-5/Orion without the need for slow inference-time compute, I doubt this is how Strawberry itself works. While the technique you describe would likely do a lot to reduce hallucinations, I don't think it would lead to a *reasoning* breakthrough unless, maybe, they made a textbook of correct and incorrect leaps of logic, as opposed to the broad library of human information that you describe.

kylemorris

Could someone explain to me how that leads to a new model class in terms of capabilities? I'm rather a layman, so I must be missing something, but it seems it's just a way to get better data. Would this data then be used to train LLMs just like we do nowadays? No big revolution inference-wise (the model is still going to miscount repeated letters in words)? I can't see how just training current LLMs with better data would lead us to another GPT-4 moment. I was under the impression Strawberry was a new architecture altogether.

renatomoraes

I think there are several reality-check evaluators, such as code execution environments, math proof checkers, physics simulations, etc., that are used in the loop when training the model or generating synthetic data. Without real feedback from a running system, the synthetic data will eventually go sour and the models won't notice.
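
For the code-execution case, such an evaluator can be as simple as running each synthetic sample against its tests in a real interpreter. A minimal sketch (the file layout and test format are assumptions):

```python
import os
import subprocess
import sys
import tempfile

def passes_reality_check(candidate_code: str, test_code: str,
                         timeout: int = 10) -> bool:
    """Ground a synthetic sample in reality: keep it only if its tests
    actually pass when executed."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "candidate_test.py")
        with open(path, "w") as f:
            f.write(candidate_code + "\n\n" + test_code + "\n")
        try:
            result = subprocess.run([sys.executable, path],
                                    capture_output=True, timeout=timeout)
        except subprocess.TimeoutExpired:
            return False  # a hung program fails the reality check too
        return result.returncode == 0

# Usage: only verified samples make it into the synthetic dataset.
sample = "def add(a, b):\n    return a + b"
tests = "assert add(2, 2) == 4\nassert add(-1, 1) == 0"
print(passes_reality_check(sample, tests))  # True
```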

famnyblom

The data they get is the responses to the questions that the AI asks. There are 200,000,000 individuals daily to ask questions of and to gain unique perspectives and data back from. This is where AI is getting new data.

LewisDecodesAI

If it works like this, Strawberry will still not be able to count the number of Rs in the word "strawberry".
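
For what it's worth, the usual explanation for that failure is tokenization rather than reasoning: the model sees subword tokens, not letters. A quick illustration with OpenAI's tiktoken library (the exact token split may vary by encoding):

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
tokens = enc.encode("strawberry")
print([enc.decode([t]) for t in tokens])  # subword pieces, e.g. ['str', 'aw', 'berry']
print("strawberry".count("r"))            # 3, trivial at the character level
```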

kkiller

All language can be represented using formal logic (symbolic logic). In my mind, reasoners arise when an LLM is able to see not just the text embedding data but the underlying symbolic logic as a linked vector. Theoretically you could create, for every possible piece of training data that makes a statement, a twin written in a formal language of logic. Vectors that point to one should also point to the other in the embedding space. In this way every piece of text that makes a statement could be used to train a reasoning model, and because math is a subset of formal logic, you would have a model that excels at reasoning. That is and has been my thesis for some time. So to hypothesize alongside you, I would say the dataset more resembles this.
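
A minimal sketch of what training on such "logical twins" could look like, using sentence-transformers with a standard contrastive loss; the pairs and their first-order-logic renderings are toy examples, not a real pipeline:

```python
from sentence_transformers import SentenceTransformer, InputExample, losses
from torch.utils.data import DataLoader

# Each natural-language statement is paired with a formal-logic twin.
pairs = [
    ("All humans are mortal.",        "forall x. Human(x) -> Mortal(x)"),
    ("Socrates is a human.",          "Human(socrates)"),
    ("Therefore Socrates is mortal.", "Mortal(socrates)"),
]

model = SentenceTransformer("all-MiniLM-L6-v2")
train_data = [InputExample(texts=[nl, fol]) for nl, fol in pairs]
loader = DataLoader(train_data, shuffle=True, batch_size=2)

# In-batch negatives pull each statement toward its own formal twin
# and push it away from the other twins in the batch.
loss = losses.MultipleNegativesRankingLoss(model)
model.fit(train_objectives=[(loader, loss)], epochs=1)
```

The contrastive objective is what enforces the constraint in the comment: a vector that points at a statement should also point at its formal twin, and at nothing else.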

jacksonmiller

This is totally random, but does anyone else wish Douglas Adams was still with us to hear his snarky hot takes on AI? I still mourn his passing more than 20 years later. I am super grateful we get to have Dave!

patrickjreid