MIT EI seminar, Hyung Won Chung from OpenAI. 'Don't teach. Incentivize.'

I gave this talk last year, when I was thinking about a paradigm shift. This delayed posting is timely, as we just released o1, which I believe is a new paradigm. It's a good time to zoom out for high-level thinking.

I titled the talk “Don’t teach. Incentivize”. We can’t enumerate every single skill we want from an AGI system because there are just too many of them. In my view, the only feasible way is to incentivize the model such that general skills emerge.

I use next token prediction as a running example, interpreting it as a weak incentive structure. It is massive multitask learning that incentivizes the model to learn a small number of general skills to solve trillions of tasks, as opposed to dealing with each task individually.

If you try to solve tens of tasks with the least effort possible, then pattern-recognizing each task separately might be easiest.

If you try to solve trillions of tasks, it might be easier to solve them by learning generalizable skills, e.g. language understanding and reasoning.
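To make the running example concrete, here is a minimal sketch (my own addition, not code from the talk) of the next-token prediction objective: a single uniform cross-entropy signal, with a toy embedding-plus-linear model standing in for a real LLM.

```python
import torch
import torch.nn.functional as F

# Toy stand-in for an LLM: embedding + linear head over a small vocabulary.
vocab_size, d_model = 100, 32
embed = torch.nn.Embedding(vocab_size, d_model)
head = torch.nn.Linear(d_model, vocab_size)

tokens = torch.randint(0, vocab_size, (4, 16))   # a batch of toy token sequences
inputs, targets = tokens[:, :-1], tokens[:, 1:]  # shift by one: predict the next token

logits = head(embed(inputs))                     # (batch, seq-1, vocab)
loss = F.cross_entropy(logits.reshape(-1, vocab_size),
                       targets.reshape(-1))
loss.backward()
# Whatever the text contains -- translation, arithmetic, code -- the model only
# ever receives this one weak signal: predict the next token.
```

The loss itself encodes no particular task; it is the diversity of the training text that implicitly bundles trillions of tasks into this single weak incentive.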

An analogy I used is extending the old saying: "Give a man a fish, you feed him for a day. Teach him how to fish, you feed him for a lifetime." I go one step further and solve this task with an incentive-based method: "Teach him the taste of fish and make him hungry."

Then he will go out and learn to fish. In doing so, he will pick up other skills, such as being patient, learning to read the weather, and learning about fish. Some of these skills are general and can be applied to other tasks.

You might think that it takes too long to teach via the incentive instead of direct teaching. That is true for humans, but for machines, we can give more compute to shorten the time. In fact, I'd say this "slower" method allows us to put in more compute.

This has interesting implications for the generalist-vs-specialist tradeoff. Such a tradeoff exists for humans because time spent specializing in a topic is time not spent generalizing. For machines, that doesn't apply: some models get to enjoy 10,000x more compute.

Another analogy is the "Room of Spirit and Time" from Dragon Ball. You train for one year inside the room, and only a day passes outside; the multiplier is 365. For machines it is far higher. So a strong generalist with more compute is often better in specialized domains than specialists are.

I hope this lecture sparks interest in high-level thinking, which will be useful in building better perspectives. This in turn will lead to finding more impactful problems to solve.

Comments

"teach him taste of fish and make him hungry"🤣

hengfun

I like your concept of scaling:
1) identify the modeling assumptions or inductive biases that bottleneck further scaling
2) replace them with more scalable ones.
Example: letting the model learn its own representations is a more scalable approach.

JimSlattery

Thanks for the insightful talk!
I love the clarity at 18:50 of seeing the LLM going through training, with so many skills implicitly demanded by the next token prediction task.

JimSlattery

Somehow Hyung Won Chung's talks are always very abstract and purely at a meta level. He doesn't talk about specific LLM techniques or anything like that, but goes all in on the fundamental intuition behind scaling 👍

windmaple

Hyung Won, you are truly amazing. I've subscribed and will always be cheering you on!! Please keep making more videos.

honon-cswl

Great point: The Bitter Lesson is the single most important piece of writing in the field of AI. 😳

JimSlattery

"No amount of bananas can incentivize monkeys to do mathematical reasoning" lol

chenjus

One of the best talks on YouTube right now!

wwkk

Thank you for the great lecture. What you said about constant unlearning felt like it opened my eyes to the world. I had never given much thought to the idea that intuitions and ideas built on false axioms need to be dismantled.

SONJOGYO

such an incredibly information dense talk, thank you!

vrushankdesai

So cool! I imagine the heavy workload must be tough. Please take care of your health too.

조바이든-rr

yep - we should follow this principle in architecture too!

-mwolf

referenced talk:
John Schulman - Reinforcement Learning from Human Feedback: Progress and Challenges

andrewthomas

I was surprised to learn that you majored in mechanical engineering for your PhD. How did you make such a big move?

hsuai

"MIT EI seminar" reads like "With egg seminar" when you are German, super weird title.... Is it about breakfast? 😂

madrooky