How Diffusion Works for Text

We dive into the paper "Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution," a technique competitive with GPT-2 that uses diffusion to generate text.

--

Oxen AI makes versioning your datasets as easy as versioning your code! Even with millions of unstructured images, the tool quickly handles any type of data so you can build cutting-edge AI.

--

Chapters
0:00 Intro
3:55 Modeling Probability Distributions for Generative AI
7:12 Problem #1: No Black Box
10:44 Solution #1: Train a Network to Approximate the Probability Mass Function
13:48 Problem #2: The Normalizing Constant, Z_theta, is Intractable
15:15 Solution #2: Autoregressive Modeling
17:15 Solution #3 (Real Solution): Model Score, Not Probability Mass
25:50 Learning the Concrete Score Through Diffusion
33:00 Evaluation
36:18 So What?
41:00 Takeaways
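The problem/solution arc in the chapters above (the normalizing constant Z_theta is intractable, so model score ratios instead of probability mass) can be sketched on a toy model. This is just an illustration, not the paper's setup: the vocabulary size, sequence length, and energy function here are made-up assumptions, kept tiny so the "intractable" sum can actually be enumerated for checking.

```python
from itertools import product
import numpy as np

rng = np.random.default_rng(0)
V, L = 5, 3  # toy vocabulary size and sequence length (assumptions)

# Toy unnormalized model: p_theta(x) proportional to exp(-E(x)) for a sequence x.
W = rng.normal(size=(L, V))
def energy(x):
    return sum(W[i, t] for i, t in enumerate(x))

# Exact probabilities need the normalizing constant Z_theta: a sum over
# all V**L sequences, intractable for realistic V and L (tiny here, so we
# can enumerate it just to verify the ratio below).
Z = sum(np.exp(-energy(x)) for x in product(range(V), repeat=L))
def prob(x):
    return np.exp(-energy(x)) / Z

# The "concrete score" is the ratio p(y)/p(x) for neighbor sequences y that
# differ from x at one position. Z_theta cancels in the ratio, so the model
# never has to compute the intractable sum:
def concrete_score(x, pos, new_tok):
    y = list(x)
    y[pos] = new_tok
    return float(np.exp(energy(x) - energy(y)))  # = p(y)/p(x), Z-free

x, y = (0, 1, 2), (0, 3, 2)
assert abs(concrete_score(x, 1, 3) - prob(y) / prob(x)) < 1e-9
```

In the paper, a network is trained to output these ratios directly (via the score entropy loss) rather than deriving them from an explicit energy; the toy above only shows why ratios sidestep Z_theta.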
Comments

Diffusion text noise could improve reasoning: the model gets a kind of overview of the whole problem instead of just guessing the next token. With autoregression, an oopsie at the start can quickly compound later on. Being able to go back and forth must be a huge boost. I could see a future model where the question is appended to the noise as "therefore the answer to [question asked] must be" to force it to answer.
It's also a step in the direction of explainability.

BooleanDisorder

Could this potentially improve function calling and adherence to certain output formats, e.g., JSON?

DanielPramel

Couldn't the diffusion perturbations happen at the embedding-vector level, as suggested in one of the questions, with a nearest-neighbor search used to map the predicted vector back to an actual token?

jensg
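A minimal sketch of the embedding-level idea in the comment above: add Gaussian noise to token embeddings, then snap the result back to the vocabulary with a nearest-neighbor search. This resembles embedding-space diffusion rather than the discrete approach in the paper; the embedding table, sizes, and noise scale here are made-up assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
V, d = 10, 4  # toy vocabulary size and embedding dim (assumptions)
emb = rng.normal(size=(V, d))  # made-up token embedding table

def perturb(token_ids, noise_scale):
    """Diffusion-style step: add Gaussian noise to the token embeddings."""
    return emb[token_ids] + noise_scale * rng.normal(size=(len(token_ids), d))

def snap_to_tokens(vectors):
    """Nearest-neighbor search: map each (possibly noisy) vector back to
    the closest token embedding, recovering actual token ids."""
    d2 = ((vectors[:, None, :] - emb[None, :, :]) ** 2).sum(axis=-1)
    return d2.argmin(axis=1)

tokens = np.array([1, 4, 7])
noisy = perturb(tokens, noise_scale=0.05)
recovered = snap_to_tokens(noisy)  # with small noise this usually round-trips
```

With zero noise the round trip is exact; as the noise scale grows, vectors start snapping to other tokens, which is exactly where a trained denoiser would have to step in.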

Thanks for the great post! Is there a practical implementation video now?

SiminFan

Tesla's diffusion model taught itself to read street signs.

rogerc