Language Models as World Models

Jacob Andreas, MIT
Comments

It's interesting, the guessing games played by these researchers!
We have a neural network which is a collection of regression trees together with word matrices: it uses regression to predict the next word, given the matrices for each position in the sequence.
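A minimal sketch of that idea in Python (the toy vocabulary, the sizes, and the mean-pooled state are all illustrative assumptions, not the actual model):

```python
import numpy as np

rng = np.random.default_rng(0)

vocab = ["the", "cat", "sat", "on", "mat"]
V, D = len(vocab), 8            # vocabulary size, embedding width

E = rng.normal(size=(V, D))     # the "word matrices": one embedding row per word
W = rng.normal(size=(D, V))     # regression weights: state -> next-word scores

def predict_next(context_ids):
    state = E[context_ids].mean(axis=0)   # state = mean of per-position embeddings
    scores = state @ W                    # linear regression onto the vocabulary
    probs = np.exp(scores - scores.max())
    probs /= probs.sum()                  # softmax: one probability per word
    return vocab[int(probs.argmax())], probs

word, probs = predict_next([0, 1])        # context: "the cat"
print(word, probs.round(3))
```

Training would fit E and W so the argmax matches the observed next word; "picking the highest probable output" is just that final argmax over the softmax.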
So we trained the same model to produce outputs based on questions: we fed it inputs and outputs and forced it to match the output given the input!
And we have many of these input/output pairs too!
So we need to understand that the neural network itself did not change, and the same network can be used to move a robotic arm!
Because what is a neural network? It predicts based on what it has seen before, so it is picking the highest-probability output given the input!
By training the model on multiple tasks... who knew that it could retain the earlier tasks!
Such that a model used to predict digits from handwriting can also be trained to answer a question!
So we find that what is actually going on is regression!
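To make the "it's all regression" point concrete, here is a toy sketch: a 3-class classification task solved with ordinary least-squares regression onto one-hot targets. The synthetic "handwriting" features are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy "handwriting" task: 200 samples, 16 features, 3 digit classes.
means = rng.normal(scale=3.0, size=(3, 16))   # one cluster per class
y = rng.integers(0, 3, size=200)
X = means[y] + rng.normal(size=(200, 16))
T = np.eye(3)[y]                    # one-hot targets turn the labels into numbers

# Fit the classifier as ordinary least-squares regression onto those targets.
W, *_ = np.linalg.lstsq(X, T, rcond=None)

pred = (X @ W).argmax(axis=1)       # regress, then pick the highest output
print("train accuracy:", (pred == y).mean())
```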
We can map nearly any task onto a regression model, which is the structure of the neural network. The transformer uses word matrices as its state! But to drive a car we would have a different state, and to generate a sound or an image we would have a different state again!
So to make the model very versatile, it is all about what state we can pass through the model, and it will produce regression trees over this state at various layers!
So we find that the layer count can help with the transformation, and the more complex the task, the more layers are required! Today we have found that with the transformer model, as long as we can put the state in TEXT format, we can use this model for various types of predictive tasks... So what is the state inside the model now? Is it word-to-word matrices? NO!
It's tensors and vectors!
So any mathematically represented data can be regressed and predicted!
Right now we are using tensors of massive width to represent the massive state of the sequence, but it could be smaller!
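That versatility claim is easy to illustrate: once any input is flattened into a fixed-width vector, the same regression head applies. A toy sketch (the encoder, the shapes, and the shared head are all assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)

D = 32                                   # one shared state width for every task
W = rng.normal(size=(D, 10))             # one shared regression head

def encode(x):
    """Flatten any tensor into a fixed-width state vector (toy encoder)."""
    flat = np.asarray(x, dtype=float).ravel()
    out = np.zeros(D)
    n = min(D, flat.size)
    out[:n] = flat[:n]                   # truncate or zero-pad to width D
    return out

text_state  = encode(rng.integers(0, 100, size=12))   # token ids
image_state = encode(rng.normal(size=(8, 8)))         # pixel grid
audio_state = encode(rng.normal(size=(4, 16)))        # spectrogram frames

for name, s in [("text", text_state), ("image", image_state), ("audio", audio_state)]:
    print(name, "->", int((s @ W).argmax()))          # same regression, different state
```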

So we find that "Attention Is All You Need" is a very important step in the transformer, allowing for retargeting of the expected output and keeping the model from straying from the actual expected outcome! These various attention mechanisms are the decisive factor in the network, because at these locations the state is what is attended to: it is rewoven into the current layer or step!
This allows us to have many layers, gradually changing the output as it passes through them. Interestingly, we find that we can take valid outputs from various layers!
Hence the attention layers are actually doing more than regression!
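A minimal sketch of that "reweaving" idea: one scaled dot-product attention step per layer with a residual connection folding the attended state back in, plus the same readout applied after every layer to show that intermediate states are already decodable. All weights are random and the shapes are invented, so this is illustrative only:

```python
import numpy as np

rng = np.random.default_rng(3)
T, D = 5, 16                                 # sequence length, state width

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attention_layer(x, Wq, Wk, Wv):
    q, k, v = x @ Wq, x @ Wk, x @ Wv         # project state to queries/keys/values
    weights = softmax(q @ k.T / np.sqrt(D))  # which positions attend to which
    return x + weights @ v                   # residual: state rewoven into the layer

x = rng.normal(size=(T, D))                  # initial state: one vector per token
readout = rng.normal(size=(D, 10))           # one shared regression head

for layer in range(4):
    Wq, Wk, Wv = (rng.normal(size=(D, D)) for _ in range(3))
    x = attention_layer(x, Wq, Wk, Wv)
    # The same readout works at every depth: each layer's state gives an output.
    print(f"layer {layer}: prediction {int((x[-1] @ readout).argmax())}")
```

The residual sum (x + weights @ v) is the reweaving step: the attended state is added back into the current layer, which is what keeps a deep stack from straying as the output passes through.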

xspydazx