Scaling Laws of AI explained | Dario Amodei and Lex Fridman

See below for guest bio, links, and to give feedback, submit questions, contact Lex, etc.

*GUEST BIO:*
Dario Amodei is the CEO of Anthropic, the company that created Claude. Amanda Askell is an AI researcher working on Claude's character and personality. Chris Olah is an AI researcher working on mechanistic interpretability.

*CONTACT LEX:*

*EPISODE LINKS:*

*SPONSORS:*
To support this podcast, check out our sponsors & get discounts:
*Encord:* AI tooling for annotation & data management.
*Notion:* Note-taking and team collaboration.
*Shopify:* Sell stuff online.
*BetterHelp:* Online therapy and counseling.
*LMNT:* Zero-sugar electrolyte drink mix.

*PODCAST LINKS:*

*SOCIAL LINKS:*
Comments

LexClips

The most likely reason that LLMs are running up against the theoretical limits proposed and verified in the scaling laws is the nature of current neural networks; at some point there will be more efficient means for computers to "learn" and process data. When the data are plotted on a log scale, it becomes very apparent that new LLMs are only marginally improved even though the number of petaflops used in training has increased exponentially.
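A minimal sketch of the log-scale picture described above, using made-up compute and loss numbers (not figures from the episode) and a simple power-law fit:

```python
import numpy as np

# Hypothetical illustration (all numbers made up): fit a power law
# L(C) = a * C**k to (compute, loss) points.  A power law is a straight
# line in log-log space, which is why each 10x increase in training
# compute buys only a small, roughly constant drop in loss.
compute = np.array([1e20, 1e21, 1e22, 1e23, 1e24])  # training FLOPs (made up)
loss    = np.array([3.0, 2.55, 2.17, 1.85, 1.57])   # eval loss (made up)

k, log_a = np.polyfit(np.log10(compute), np.log10(loss), 1)  # slope k < 0
a = 10 ** log_a
print(f"fitted exponent k = {k:.3f}")

for c in compute:
    print(f"compute {c:.0e} FLOPs -> predicted loss {a * c**k:.2f}")
```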

MagnetarCO

I think I might have an explanation for the firm boundary that LLMs seem to hit. My own research takes texts on every subject matter, across as many languages as I could find, and searches for their semantic meaning using SentenceTransformer to generate 768-dimensional embeddings, then reduces them with multiple reduction methods (PCA, UMAP, t-SNE, Isomap). It shows that information and language have a geometrical metastructure: aperiodic, yet regularly oscillating and fractalizing. Adding more data doesn't diversify this shape, it only further defines it. Eigenvalues show that almost all of the semantic nuance is captured in the first few dimensions, and the rest of the 768 just polish away noise. LLMs hit a limit because meaning has structure in semantic space and is in fact neither arbitrary nor combinatorially explosive: while it is aperiodic, it has a calculable pattern and firm mathematical criteria for what constitutes a meaningful connection, even if it is abstract. Anyway, I have all the math, comprehensive visualizations, and proof that this is the case, but academia isn't interested. What I've found could potentially solve a great number of problems.
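For readers curious what that kind of pipeline looks like, here is a minimal sketch assuming the sentence-transformers and scikit-learn packages; the model name and tiny placeholder corpus are illustrative, not the commenter's actual data or method:

```python
import numpy as np
from sentence_transformers import SentenceTransformer
from sklearn.decomposition import PCA

# Placeholder corpus; the commenter used texts across many subjects and languages.
sentences = [
    "The cat sat on the mat.",
    "A feline rested on the rug.",
    "Scaling laws relate model loss to compute and data.",
    "El gato se sentó en la alfombra.",
    "Les lois d'échelle relient la perte au calcul.",
    "Proteins fold into three-dimensional structures.",
]

# all-mpnet-base-v2 produces 768-dimensional sentence embeddings.
model = SentenceTransformer("all-mpnet-base-v2")
embeddings = model.encode(sentences)  # shape: (n_sentences, 768)

# The PCA eigenvalue spectrum shows how much variance the leading
# directions capture relative to the remaining dimensions.
pca = PCA().fit(embeddings)
cumulative = np.cumsum(pca.explained_variance_ratio_)
print("cumulative variance explained by leading components:", cumulative[:5])
```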

justinrose

The missing link that causes a slowdown in the intelligence of AI models is the lack of training sets originating in human environments. The lack of stereo vision and scale causes glitches in the recreation of visual features (six-fingered humans, facial morphing of the same character, and so on). The models also lack physical interaction and true daily intellectual communication in a physical context. This will only improve through the introduction of robots into human environments or training on advanced synthetic data.
Another challenge lies in the question-answer format. The model forms an answer solely based on the question; few models ask for clarification about missing axioms or ambiguity in the question.

PierreH

But what about the current talk about scaling laws hitting a bottleneck? Orion seems to underperform.

JeffreyWongOfficial

Good to see software guys also dealing with 1/f noise

Fisherdec

It seems obvious that you do a better job of spanning the space with more data and more connectivity to record that information. The interesting part is the form that the approach to the "limit" takes.

mikefredd

legalpdf AI fixes this. Scaling Laws of AI explained

EtsukoJasper

8:25
8:49 8:58 9:01 9:04 9:07 9:13 9:17 9:21 9:27 9:30 9:37 9:45 9:53 9:59
“Domain Dependent”

“Suggests to understand our domain or become beholden to remain within “

Why scale up into a void of ignorance?

The desire to be released from ignorance while spreading it around is domain dependent

Jeremy

Jeremy-Ai

Well, I'm a coder with 20 years of experience using GPT-4o, and let me tell you, the level of coding he's describing at 16:45 is nowhere to be found. Here is ChatGPT's response: "Why People Make Claims About AI Replacing Coders?
1. Hype and Overpromises: Media and industry leaders sometimes exaggerate the potential of AI, leading people to believe it can do everything a human can. The truth is that AI has limitations—it can help with certain tasks, like generating code snippets, providing documentation, or solving specific, "

MichaGero

I wish I could hear more on the impact of Quantum computing on A.I. Love the content btw.

kamartaylor

I think the data problem will be solved once we have agents with good computer interfaces. Just like AlphaGo, these agents will go and generate their own datasets by interacting first with the digital world, then with the physical world through robotic embodiment.
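A toy sketch of the loop described above, with a stand-in environment and policy (every name here is hypothetical); the only point is that the agent's own interactions become its next dataset:

```python
import json
import random

def environment_step(action: int) -> float:
    """Stand-in for a computer-use or robotics environment: returns a reward."""
    return 1.0 if action == 1 else 0.0

def agent_policy(observation: int) -> int:
    """Stand-in policy; a real agent would be a learned model."""
    return random.choice([0, 1])

# The agent acts, and every interaction is logged as a new training example,
# so it generates its own data instead of relying only on human-written text.
dataset = []
for step in range(100):
    observation = step % 10
    action = agent_policy(observation)
    reward = environment_step(action)
    dataset.append({"obs": observation, "action": action, "reward": reward})

# The logged trajectories become the next round of training data.
with open("self_generated_data.jsonl", "w") as f:
    for example in dataset:
        f.write(json.dumps(example) + "\n")

print(f"collected {len(dataset)} self-generated examples")
```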

aelisenko

9:24 "no ceiling below the level of humans". What I thought as well.

Merializer

Bigger doesn't always mean better. The scaling hypothesis discussed in the video highlights the value of making AI systems larger to achieve better performance. However, we should also consider a different approach: smaller, more versatile, and efficient AI models. Instead of focusing solely on massive networks and huge compute, imagine an AI designed to master the basics consistently.

What if we had a compact, general-purpose AI capable of running on personal computers or mobile devices, with features like multimodality? This small but capable AI could handle practical tasks like moving a mouse, doing chores, controlling a 3D character, or operating a robot.

Think beyond the screen: imagine this AI in your smartphone, remotely controlling a car, robot, or motorcycle via Bluetooth or the internet. The limits dissolve when the power lies in efficient design rather than size. A small AI trained in diverse tasks—coding, movement, image, audio, and human-like interactions—could transform accessibility, making powerful AI affordable and practical for everyone.

We shouldn't only pursue "bigger and better" scaling; instead, we need to invest in focused, streamlined models that dominate everyday applications. The future might not just be about scaling but about making intelligence compact, affordable, and universally useful.

antoniobortoni

It's weird how more data and faster computing would increase capability... this guy must be really smart

justinsheppard

Dude's literally restraining the instinct to critique humanity at large here. The elephant in the room is: what are the limits of our capacity to imagine the limits of a superior intelligence? We can't imagine what we lack the means to imagine, the way a dog can hardly comprehend why it may be yelled at (or worse) for certain behaviors. Call it an assumption if you want, but I doubt we'll ever have a clue what AI could be capable of, because models don't curate their own training data. It's not like you can just hook one up to a set of artificial sense organs and let it shape its own discourse and collection of data. There will always be limitations imposed by the bias of the data selected for training.

JacksonHansen-sr

But what about the news that Orion is not predictably better than GPT-4 across a variety of tasks, and the implications, which other sources have also discussed, that current techniques are hitting a bottleneck?

whttodonow

My cluster is running a 3B LLM: a Raspberry Pi 5 cluster running off solar. The Ryzen 7 5825 was good for a 7B, an 11B, and a 3B.
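As a rough illustration of serving a small quantized model on modest hardware like that, here is a hedged sketch assuming the llama-cpp-python package and a placeholder GGUF checkpoint (not the commenter's actual setup):

```python
from llama_cpp import Llama  # assumes the llama-cpp-python package is installed

# The model path is a placeholder; any small (~3B) GGUF-quantized checkpoint
# would do.  Modest context and thread settings keep memory and CPU use low
# enough for single-board computers or older laptops.
llm = Llama(
    model_path="models/my-3b-model.Q4_K_M.gguf",  # placeholder path
    n_ctx=2048,
    n_threads=4,  # match the number of physical cores on the machine
)

result = llm("Explain scaling laws of AI in one sentence:", max_tokens=64)
print(result["choices"][0]["text"])
```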

Unineil

Bigger networks can map more meaning as vectors into the hyper-dimensional space.

vegnagunL

9:40, has anyone asked the AI what else it needs to grow more?

CCC