How AI Image Generators Work (Stable Diffusion / Dall-E) - Computerphile

AI image generators are massive, but how are they creating such interesting images? Dr Mike Pound explains what's going on.

Thumbnail image partly created by DALL-E with the prompt: "Computerphile YouTube Video presenter Mike Pound Explains Diffusion AI methods thumbnail with green computer style title text on a black background with grey binary"

This video was filmed and edited by Sean Riley.

Related videos

Comments

Glad to have finally found someone I can actually listen to about AI, someone that doesn't hype things up and isn't trying to sell me something.

InfinityDz

Stable Diffusion doesn't actually apply noise to images; it uses a compressed, low-dimensional latent representation of the image and applies noise to that. The model runs in this abstract latent space, and the autoencoder recreates the image afterwards.

Qman
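
To make the point in the comment above concrete, here is a minimal sketch of noising a latent rather than the pixels. Everything in it is an assumption for illustration: `encode` and `decode` are toy stand-ins (averaging and repeating values) for the learned autoencoder, and `noise_latent` is a hypothetical helper, not Stable Diffusion's actual code.

```python
import random

def encode(pixels, factor=4):
    """Toy encoder (stand-in for a learned one): average each group of `factor` pixels."""
    return [sum(pixels[i:i + factor]) / factor for i in range(0, len(pixels), factor)]

def decode(latent, factor=4):
    """Toy decoder (stand-in for a learned one): repeat each latent value `factor` times."""
    return [z for z in latent for _ in range(factor)]

def noise_latent(latent, strength, rng):
    """Blend Gaussian noise into the latent values, not into the pixels."""
    keep = (1.0 - strength) ** 0.5
    mix = strength ** 0.5
    return [keep * z + mix * rng.gauss(0.0, 1.0) for z in latent]

rng = random.Random(0)
pixels = [0.2] * 8 + [0.8] * 8                  # 16 "pixel" values
latent = encode(pixels)                         # 4 latent values
noisy_latent = noise_latent(latent, strength=0.5, rng=rng)
image_out = decode(noisy_latent)                # back to 16 values
```

The noise is drawn for 4 numbers instead of 16 here, which is the practical appeal of working in the latent space: the denoising network handles a much smaller array than the full image.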

A deep dive on the google colab code would be amazing!

ayushdhar

Came here by accident and man, aren't you the gifted one? I was engrossed in the video knowing barely anything about the technologies and techniques used, and I don't feel dumber -- that's an achievement :)
Thanks again, will pop here often.

kgsz

Finally! Ever since Stable Diffusion was released I was looking for an explainer on how it worked that wasn't "Oh it generates images from noise" or something that went too deep into technicals that I didn't understand.

Very beautifully explained Dr. Mike Pound! Hope you do another video where you dive into the code where we can see the parts which were visualized here.

One thing that's still unclear to me is how the network was trained to relate text with images, and how it uses this information when actually producing images.

wlockuz

The explanation sounds like magic. It is like a sculptor saying he just chips away pieces of the stone until he finds the horse hidden inside.

beachdancer

So Stable Diffusion is just the AI version of that sculpting joke: start with a big block and take away the parts that don't fit.

Ultimatro

Can't believe Mike can effortlessly make that shape with his hand (little finger) at 5:37

serhat

I couldn't agree more! Since the release of Stable Diffusion, I've been searching for an explanation that strikes the right balance between simplicity and technicality. Your video did an excellent job of providing a clear understanding without overwhelming us with excessive technical details. Dr. Mike Pound, you have a remarkable talent for explaining complex topics in a beautifully straightforward manner!

aijeveryday_guy

Oh I DEFINITELY want to see Mike's deep dive into the code!

juliankandlhofer

Add noise to images and train a model to undo that addition.. then you have something that maps from noise to images.

One thing I find so impressive about these researchers.. is that they would try this. It’s so bizarre.. just because, from a distance, it’s not at all clear that such a task is doable.

Mutual_Information
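
The "add noise, then train a model to undo it" idea in the comment above can be sketched in a few lines. This is an illustrative sketch, not the code from the video: it assumes a DDPM-style linear noise schedule with example values, and the function names are hypothetical. No network is trained here; the true noise is passed back in where a trained network's estimate would go.

```python
import math
import random

def alpha_bar(t, num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Product of (1 - beta_s) over the first t steps of a linear schedule (assumed values)."""
    prod = 1.0
    for s in range(t):
        beta = beta_start + (beta_end - beta_start) * s / (num_steps - 1)
        prod *= 1.0 - beta
    return prod

def add_noise(x0, t, rng):
    """Jump straight to step t: x_t = sqrt(a) * x0 + sqrt(1 - a) * eps."""
    a = alpha_bar(t)
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [math.sqrt(a) * x + math.sqrt(1.0 - a) * e for x, e in zip(x0, eps)]
    return xt, eps

def remove_noise(xt, eps_estimate, t):
    """Invert add_noise given a noise estimate (a trained network would supply it)."""
    a = alpha_bar(t)
    return [(x - math.sqrt(1.0 - a) * e) / math.sqrt(a) for x, e in zip(xt, eps_estimate)]

rng = random.Random(0)
image = [0.1, 0.5, 0.9, -0.3]                      # stand-in for pixel values
noisy, true_eps = add_noise(image, t=500, rng=rng)
recovered = remove_noise(noisy, true_eps, t=500)   # perfect estimate gives the image back
```

With a perfect noise estimate the original values come back exactly (up to floating-point error), which is why predicting the noise is a usable training target: the better the estimate, the closer the reconstruction.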

Would have been nice to hear a bit more about the "GPT-style transformer embedding". Wouldn't those classifications have to be included in the training data already?

carlborgen

"I saw the angel in the marble and carved until I set him free." - Michelangelo

housellama

12:58 I'd like to hear more about that GPT-style transformer embedding of text. Was text part of the training set?

dileepvr

I tried to guess how these things work. Now I'm taking the difference between my guess and this explanation and feeding it to my neurons. Thanks!

Zothaqqua

Been listening to house music in the background (on the low down) while watching the odd Computerphile / Numberphile video for quite a while now.
Thought it was time to fess up.
Vibing it is probably just me on this tip.

chaoslab

Wow! Had not seen listing paper since my dad was trying to teach me BASIC on a Commodore 64. Had no idea it was still a thing. Big jump from having to read code on paper to make sense of it to this.

danieletorrigiani

I followed some of that.. but some of that also sounded a lot like Michelangelo's "start with the block of marble and carve away everything that doesn't look like 'X'." I will come back to watch this again after the first watching settles! Thank you for providing this.

rickhobson

My favorite part is where he explains AI while drawing on printer paper from 1989 XD

user__

I would love to hear more about the process. Like how does it recognize that the image now looks like a frog on stilts? Seems to me like that's where the real complexity is.

patu

MIT CSAIL Researcher Explains: AI Image Generators

AI art, explained

How Stable Diffusion Works (AI Image Generation)

How Dall-E 2 and Other AI Art Generators Create Images From Text | WSJ

Explained simply: How does AI create art?

How AI 'Understands' Images (CLIP) - Computerphile

Text-to-image generation explained

🎸Quotations Café Créatif✨🤖 Magic AI Art - Café Créatif #badgecanva #englishlanguage#chilloutmusic...

How To Generate INSANE AI Art For Beginners (Midjourney V4 )

Stable Diffusion in Code (AI Image Generation) - Computerphile

An AI artist explains his workflow

STOP Using Midjourney, Try This FREE AI Image Generator Instead!

How Stable Diffusion Works (AI Text To Image Explained)

DALL·E 2, Stable Diffusion, Midjourney: How do AI art generators work, and should artists fear …

How Do AI Image Generators Work? #ai

Why AI art struggles with hands

Why Artists are Fed Up with AI Art.

How! AI Image Generators Work: Stable Diffusion

How AI Image Generators Work? | NeuronRush

How to Use Midjourney - Ai Text To Image Generator - Beginner's Guide

Secrets to Creating Stunning AI Images: Expert Prompts

How to Generate this AI trending Images for Free

The Current Absurd State of Generating AI Images