Diffusion Models - Live Coding Tutorial

This is my (mostly) live coding video, where I implement from scratch a diffusion model that generates 32 x 32 RGB images. The tutorial assumes a basic knowledge of deep learning and Python.

Links:

Sources:

Timestamps:
0:00 Introduction
0:32 Theoretical background
13:13 Live Coding - Forward diffusion
41:29 Live Coding - Training loop
1:00:05 Live Coding - Overfitting one batch
1:03:36 Live Coding - Reverse diffusion
1:13:40 Live Coding - Training on the CIFAR-10 dataset
1:17:24 Live Coding - Result evaluation
1:19:40 (Bonus) Quick explanation of the UNet architecture used in the tutorial
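
For reference, the heart of the forward-diffusion segment is the closed-form sampling of q(x_t | x_0). Below is a minimal sketch assuming a standard DDPM linear beta schedule with 300 steps; the exact constants and variable names in the video may differ.

```python
import torch

T = 300  # number of diffusion steps (an assumption, not necessarily the video's value)

# Linear beta schedule and the cumulative products used by the closed form
betas = torch.linspace(1e-4, 0.02, T)
alphas_cumprod = torch.cumprod(1.0 - betas, dim=0)

def forward_diffusion(x0: torch.Tensor, t: torch.Tensor):
    """Sample x_t ~ q(x_t | x_0) in a single step.

    x0: batch of images, shape (B, 3, 32, 32), scaled to [-1, 1]
    t:  integer timesteps, shape (B,)
    """
    noise = torch.randn_like(x0)
    sqrt_ac = alphas_cumprod[t].sqrt().view(-1, 1, 1, 1)
    sqrt_om = (1.0 - alphas_cumprod[t]).sqrt().view(-1, 1, 1, 1)
    # x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * noise
    return sqrt_ac * x0 + sqrt_om * noise, noise
```

The network is then trained to predict `noise` from (x_t, t), typically with an MSE loss.
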
Comments

Thanks man, I really appreciate your work

adeolaogunleye

I have looked at almost every video on this subject, and this is by far the best approach: it's simple enough to be well understood, but it gives you all the tools to build more advanced models. I wish you could do a remake of this one, because sometimes the code snippet is out of frame and sometimes it's hard to read because of the font size. Thanks a lot for this upload!

outroutono

Thanks for sharing your work with us, I appreciate it!

bbbaaa

Good tutorial, I just wish we could see the whole screen while you're coding, as most of the new lines you added were off-screen :/ Keep it up!

danielfirebanks

Thanks a lot! I really appreciate it.
This tutorial explains things clearly. Awesome!
Hope to see more tutorial videos on your YouTube channel, thanks.

chichi

Great tutorial. Thanks for sharing.
Please make slightly more advanced tutorials, like conditional (image or text) generation of images using diffusion.
I see that there are very few advanced tutorials by any YouTuber.

kanakraj

Better font, but I still can't read it, not only on my phone, which is my main content-consuming device, but even on my 13-inch MacBook. Thank God I have a 55-inch TV I can watch it on. Even with such struggles I will continue to watch such a diamond of a video!
Thanks for the video! Great content!

VitaliyHAN

Thanks for this video. Can you make a video about applying higher resolutions to this project?

duyquangnguyen

You should have zoomed in on the screen more so that it's properly visible. Still, I appreciate your efforts! Nice vid.

paneercheeseparatha

Hi, thanks for the video. But can you explain the part about how you introduce the positional encoding to the network? Also, can this model work with a feed-forward neural network rather than a U-Net?

anshumansinha
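
For readers with the same question: diffusion UNets commonly inject the timestep through a sinusoidal embedding (as in the Transformer paper), which is passed through a small MLP and added to the feature maps of each block. A minimal sketch of such an embedding follows; this is the common pattern, not necessarily the exact code from the video.

```python
import math
import torch

def timestep_embedding(t: torch.Tensor, dim: int) -> torch.Tensor:
    """Map integer timesteps of shape (B,) to sinusoidal features of shape (B, dim)."""
    half = dim // 2
    # Geometrically spaced frequencies, as in the original Transformer encoding
    freqs = torch.exp(
        -math.log(10000.0) * torch.arange(half, dtype=torch.float32) / half
    )
    args = t.float()[:, None] * freqs[None, :]
    return torch.cat([torch.sin(args), torch.cos(args)], dim=-1)
```

As for the second question: the training objective itself does not require a U-Net; in principle any network that maps (x_t, t) to a noise estimate can be used, though a plain feed-forward network tends to perform much worse on images.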

Thanks for the tutorial.
Why is posterior_variance_t = betas_t? Shouldn't it be equal to betas_t * (1 - alphas_cumprod_t_minus_1) / (1 - alphas_cumprod_t), according to [Lil' Log]?

jflimnl
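
On the question above: the DDPM paper (Ho et al., 2020) reports that both choices, sigma_t^2 = beta_t and the "true" posterior variance beta_tilde_t = beta_t * (1 - alpha_bar_{t-1}) / (1 - alpha_bar_t), gave similar results in practice, so using betas_t is a common simplification. A sketch of the latter, assuming the same schedule tensors as in the forward-diffusion sketch above:

```python
import torch
import torch.nn.functional as F

T = 300
betas = torch.linspace(1e-4, 0.02, T)
alphas_cumprod = torch.cumprod(1.0 - betas, dim=0)

# alpha_bar_{t-1}, padded with the convention alpha_bar_0 = 1
alphas_cumprod_prev = F.pad(alphas_cumprod[:-1], (1, 0), value=1.0)

# "True" posterior variance beta_tilde_t from the DDPM paper
posterior_variance = betas * (1.0 - alphas_cumprod_prev) / (1.0 - alphas_cumprod)
```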

Is there a difference between `result = alpha_hat.gather(-1, t)` and `result = alpha_hat[t]`?

brunokemmer
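
For what it's worth, the two expressions are equivalent when `alpha_hat` is a 1-D schedule tensor and `t` is a 1-D batch of timestep indices; `gather` mainly matters for higher-dimensional inputs. A quick check under those assumptions:

```python
import torch

alpha_hat = torch.cumprod(1.0 - torch.linspace(1e-4, 0.02, 300), dim=0)
t = torch.randint(0, 300, (8,))  # a batch of timestep indices

a = alpha_hat.gather(-1, t)  # gather along the last (and only) dimension
b = alpha_hat[t]             # plain advanced indexing
assert torch.equal(a, b)     # identical results for this 1-D case
```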

Coming from a programming background, I always find it very strange to name variables with generic Greek letters or just X and Y. I am not criticizing your video specifically; it is a pattern that is very widespread. But, for example, you name the first parameter of the forward_diffusion function "x0". Is it to save space? Is it because you think it is easier to relate it to the mathematical formulas?
In my mind it would be much clearer if "x0" were named "image". Or am I perhaps misunderstanding your explanation?
As I mentioned, I don't think your video is bad. I'm just curious why it is so common for machine-learning code to be so generically named.

nqvst

Can you make an image-to-image tutorial?

chiscoduran

Can you say why the output was not as fascinating, and what can be done from here to make the output clearer? @dtransposed79

playmaker

Thanks man, I really appreciate your work

utxuebc