How does DALL-E 2 actually work?

DALL-E 2 has arrived in the AI world with a bang. It is one of the best generative models we have seen to date. But how does this magical model work? In this video, we take a look at the architecture of DALL-E 2 to understand its working principles.

00:00 Overview
00:34 What can DALL-E 2 do?
00:55 Architecture overview
01:27 CLIP embeddings
03:05 The prior
04:24 Why do we need the prior?
05:20 The decoder
06:13 How are variations created?
06:56 Model evaluation
07:36 Limitations and risks of DALL-E 2
09:21 Benefits of DALL-E 2
10:00 A question for you!
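
Before diving in, here is a rough code sketch of the pipeline the chapters above walk through: a CLIP text embedding, a prior that maps it to a CLIP image embedding, and a diffusion decoder plus upsamplers that turn that embedding into a picture. Every function below is a toy placeholder with dummy outputs, not the actual OpenAI code or API.

import numpy as np

# Toy stand-ins for the real networks; names, shapes, and sizes are illustrative only.
def clip_text_encoder(caption: str) -> np.ndarray:
    return np.zeros(768)                         # CLIP text embedding of the caption

def diffusion_prior(text_emb: np.ndarray) -> np.ndarray:
    return np.zeros(768)                         # predicted CLIP *image* embedding

def decoder(image_emb: np.ndarray, caption: str) -> np.ndarray:
    return np.zeros((64, 64, 3))                 # diffusion decoder: small 64x64 RGB image

def upsample(image: np.ndarray) -> np.ndarray:
    h, w, c = image.shape
    return np.zeros((h * 4, w * 4, c))           # diffusion upsampler: 4x the resolution

def generate(caption: str) -> np.ndarray:
    text_emb = clip_text_encoder(caption)        # 1. caption -> CLIP text embedding
    image_emb = diffusion_prior(text_emb)        # 2. prior -> CLIP image embedding
    small = decoder(image_emb, caption)          # 3. decoder -> 64x64 image
    return upsample(upsample(small))             # 4. two upsamplers -> 1024x1024 image

print(generate("an astronaut riding a horse").shape)  # (1024, 1024, 3)

Variations (06:13) reuse the same decoder: an existing image is encoded with CLIP's image encoder, and that embedding is decoded again, producing new images that keep the original's content and style while changing the details.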

Would you like to read this information instead? Check out the blog post on DALL-E 2 👇


#MachineLearning #DeepLearning
Comments

That's pretty impressive technology. This tech has a nice future ahead, and the narrator smartly explains pretty much everything.

smritirani

I am deeply interested in how AI can learn and become connected with our internet. Thank you for showing us the cutting edge of that! I hope this program is a massive success!

silasmerrell

Nice explanation! I would like to know more about diffusion models in depth. They seem completely different from GANs.

sumansaha

Super clear explanation, great work!!!

lodewijk.

Of course, it is named after DALL-E 1, which in turn is named after Salvador Dalí and WALL-E

tednoob

I still don't get HOW it creates the image. I mean, does it use existing images, or does it "paint" them? How does it know about shadows and reflections in completely original images? That's what blows my mind about this

DanielRieger
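
A note on the question above: the decoder does not fetch or collage stored photos. It starts from pure random noise and runs many denoising steps, each one nudged by the CLIP image embedding, until a brand-new image emerges; shadows and reflections come from statistics the network learned during training, not from any single source picture. Below is a deliberately simplified sketch of that sampling loop; predict_noise, the step sizes, and the shapes are illustrative placeholders, not the real GLIDE update rule.

import numpy as np

def predict_noise(x, t, image_emb):
    # Placeholder for the trained denoising network: it guesses the noise present in x.
    return np.zeros_like(x)

def sample(image_emb, steps=50, shape=(64, 64, 3)):
    x = np.random.randn(*shape)                     # start from pure noise, no source image
    for t in reversed(range(1, steps + 1)):
        eps = predict_noise(x, t, image_emb)        # guess which part of x is noise
        x = x - eps / steps                         # peel a little of that noise away
        if t > 1:
            x = x + 0.01 * np.random.randn(*shape)  # fresh noise keeps samples diverse
    return x                                        # a new image "painted" out of noise

print(sample(image_emb=np.zeros(768)).shape)        # (64, 64, 3)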

Actually, I was trying to find an academic explanation for DALL-E and you explained it perfectly! Now I get the point

elhamkeshavarzarshadi

Hello everyone,
Thank you for this presentation.
I have a question: Does DALL-E 3 work the same way as DALL-E 2? Does it use the same architecture and technical components? What is the technical difference between the two?

dhifallahothmen

Thank you very much for this video. It is unbelievable that they basically solved human creativity, and they did it with a simple model. This is an earthquake for a whole industry. The progress compared to DALL-E 1 in just one year is astonishing. What will they show us next year?

oholimoli

I would love to know more about how the diffusion model understands the "grammar" of an image... I understand how recreating an existing image could work (conceptually), but I cannot grasp how it can use diffusion to generate a meaningful (as in coherent) image that has never existed before.
Would someone recommend any further readings or videos on this, please?

EmanuelGaldr

6:43 that's pretty cool, but the computer didn't manage to make any of the clocks floppy :P

tristimulus

Thanks for the video. If the decoder is GLIDE, what model is the prior? Is it VAE-based?

salomeshunamon

Wow, that was a fantastic explanation, thank you!

Kurzrein

an excellent lecturer, immeasurable gratitude for the knowledge you share /\

Dharma_

> What is DALL-E 2 named after?

DALL-E 2 is named after DALL-E 1. HTH

StefanReich

One of the simplest descriptions I've found for this. Nice job.

culpritdesign

Can we access the pre-trained model and fine-tune it on our own dataset?

marverickbin

Let's say we enter the text "A dolphin wearing glasses".
Does it fetch source images of a dolphin and glasses from its large database, crop the objects from the two sources, and combine them in a new image? Then what does diffusion do? How does adding noise to an image help in this process? Why reverse the noise-adding process when we have the original crisp image?

taseronify
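
On the questions above: nothing is fetched from a database at generation time. During training, the model repeatedly takes a real training image, adds a known amount of noise, and learns to predict that noise; the crisp original is only needed then, to define the target. At generation time, the learned reversal is run starting from pure noise, conditioned on the text, so the dolphin and the glasses are synthesized together rather than cropped from two sources. Here is a toy version of one training step; model, the noise schedule, and the shapes are illustrative placeholders, not the actual DALL-E 2 training code.

import numpy as np

def diffusion_training_step(model, image, text_emb, num_steps=1000):
    # One schematic step: corrupt a training image, then learn to spot the corruption.
    t = np.random.randint(1, num_steps)             # pick a random noise level
    noise = np.random.randn(*image.shape)           # the noise we are about to add
    keep = 1.0 - t / num_steps                      # toy schedule: larger t = more noise
    noisy = np.sqrt(keep) * image + np.sqrt(1 - keep) * noise
    predicted = model(noisy, t, text_emb)           # the model tries to recover the noise
    loss = np.mean((predicted - noise) ** 2)        # mean squared error on the noise
    return loss                                     # would be used to update the weights

dummy_model = lambda x, t, emb: np.zeros_like(x)    # stand-in network that predicts zeros
print(diffusion_training_step(dummy_model, np.random.randn(64, 64, 3), np.zeros(768)))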

What is the name DALL-E 2 based on? Any guesses? Leave a comment!

AssemblyAI

When and how will this be available to the public? Will there be an iPad app?

kevinsturges