Denoising Diffusion Probabilistic Models Code | DDPM PyTorch Implementation

In this video I get into the implementation of Denoising Diffusion Probabilistic Models (DDPM) and walk through the complete DDPM code in PyTorch.

I give a quick overview of the math behind diffusion models before getting into the DDPM implementation; the key equations are summarized below.
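For reference, this is the standard DDPM formulation (Ho et al., 2020) that the video reviews: the forward process gradually adds Gaussian noise according to a variance schedule, admits a closed form for sampling x_t directly from x_0, and is inverted step by step using the network's noise prediction.

```latex
% Forward (noising) process with variance schedule \beta_1,\dots,\beta_T:
q(x_t \mid x_{t-1}) = \mathcal{N}\!\left(x_t;\ \sqrt{1-\beta_t}\,x_{t-1},\ \beta_t \mathbf{I}\right)

% Closed form for jumping from x_0 to any x_t, where
% \alpha_t = 1-\beta_t and \bar{\alpha}_t = \prod_{s=1}^{t}\alpha_s:
q(x_t \mid x_0) = \mathcal{N}\!\left(x_t;\ \sqrt{\bar{\alpha}_t}\,x_0,\ (1-\bar{\alpha}_t)\mathbf{I}\right)

% Reverse (denoising) step, with \epsilon_\theta predicting the added noise:
x_{t-1} = \frac{1}{\sqrt{\alpha_t}}\left(x_t - \frac{1-\alpha_t}{\sqrt{1-\bar{\alpha}_t}}\,\epsilon_\theta(x_t,t)\right) + \sigma_t z,
\qquad z \sim \mathcal{N}(0,\mathbf{I})
```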
I cover the denoising diffusion probabilistic models PyTorch implementation in five parts:
1. Noise scheduler in DDPM - coding the forward and reverse process of DDPM in PyTorch (a minimal sketch follows this list)
2. Model architecture for denoising diffusion probabilistic models - Unet
3. Implementing the Unet, which can be reused in any diffusion models code
4. Training and sampling code for DDPM
5. Results of training DDPM
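To make part 1 concrete, here is a minimal sketch of a linear noise scheduler implementing the two forward/reverse equations above; class and method names are placeholders of mine, not necessarily those used in the video's repository:

```python
import torch

class LinearNoiseScheduler:
    """Minimal DDPM noise scheduler with a linear beta schedule (illustrative sketch)."""

    def __init__(self, num_timesteps=1000, beta_start=1e-4, beta_end=0.02):
        self.betas = torch.linspace(beta_start, beta_end, num_timesteps)
        self.alphas = 1.0 - self.betas
        self.alpha_bars = torch.cumprod(self.alphas, dim=0)

    def add_noise(self, x0, noise, t):
        """Forward process: sample x_t ~ q(x_t | x_0) in closed form (t: batch of ints)."""
        sqrt_ab = self.alpha_bars[t].sqrt().view(-1, 1, 1, 1)
        sqrt_1m_ab = (1.0 - self.alpha_bars[t]).sqrt().view(-1, 1, 1, 1)
        return sqrt_ab * x0 + sqrt_1m_ab * noise

    def sample_prev(self, xt, noise_pred, t):
        """One reverse step: x_{t-1} from x_t and the predicted noise (t: python int)."""
        mean = xt - (1 - self.alphas[t]) / (1 - self.alpha_bars[t]).sqrt() * noise_pred
        mean = mean / self.alphas[t].sqrt()
        if t == 0:
            return mean  # no noise is added at the final step
        variance = (1 - self.alpha_bars[t - 1]) / (1 - self.alpha_bars[t]) * self.betas[t]
        return mean + variance.sqrt() * torch.randn_like(xt)
```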

Timestamps:
00:00 Intro
00:30 Denoising Diffusion Probabilistic Models Math Review
03:15 Noise Scheduler for DDPM
04:30 Noise Scheduler PyTorch Code for DDPM
07:10 Denoising Diffusion Probabilistic Models Architecture
08:10 Time embedding Block for DDPM Implementation
08:54 Overview of Unet Architecture for DDPM
09:49 Downblock of DDPM Unet
11:34 Midblock and Upblock for DDPM Unet
12:40 Code for Positional Embedding in DDPM in PyTorch
14:07 Code for Downblock in DDPM Unet
16:42 Code for Mid and Upblock in DDPM Unet
18:53 Unet class for DDPM
22:04 Code for Diffusion Model training
22:47 Code for Sampling in Denoising Diffusion Probabilistic Model
23:24 Configurable Code
24:15 Dataset for training
24:56 Results after DDPM training
25:42 Thank you

📄 Code Repository:

🔔 Subscribe:

Background Track - Fruits of Life by Jimena Contreras

🔗 Related Tags:
#DDPM #DiffusionModels #DDPMImplementation #GenerativeAI
Comments

I am very thankful for your nice video; it's the best explanation of the diffusion model I have seen!

zhuangzhuanghe

Nicely explained! Keep the good work going! 😁

prathameshdinkar

Hi, amazing explanation! Thanks for all the efforts you put into making the video.
Can you please share the details of the UNet model that you've used (maybe a link to a paper/blog)? Thank you!

purnavindhya

Hi there, thanks for the video. May I ask a question? To my understanding, multi-headed attention first applies three feed-forward networks for the key, query, and value. In this model you applied multi-headed attention to the image with the channels playing the role of the sequence length and the flattened image playing the role of the token length. That should mean that the query network, for example, is a Linear(token_length/4, token_length/4), so its parameter count should be ((h*w)**2)/16, which is huge. Or am I wrong?

binyaminramati
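For what it's worth, the usual convention in Unet attention blocks (an assumption here; I have not checked it against the video's exact code) is the opposite of what the comment above describes: the flattened spatial positions act as the token sequence and the channels act as the embedding dimension, so the Q/K/V projections are channels x channels and independent of the image size:

```python
import torch
import torch.nn as nn

batch, channels, h, w = 2, 64, 16, 16
x = torch.randn(batch, channels, h, w)

# Each of the h*w spatial positions is a token with `channels` features,
# so the Q/K/V projections are (channels x channels), not ((h*w) x (h*w)).
attn = nn.MultiheadAttention(embed_dim=channels, num_heads=4, batch_first=True)

tokens = x.flatten(2).transpose(1, 2)    # (batch, h*w, channels)
out, _ = attn(tokens, tokens, tokens)    # self-attention over spatial positions
out = out.transpose(1, 2).reshape(batch, channels, h, w)
```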

Very well explained! What changes would we need to make if we used our own dataset, specifically greyscale?

muhammadawais
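A hedged sketch of the usual adaptation for greyscale data (the config key name mentioned in the comment is hypothetical; check the repository's actual YAML): set the model's image channel count to 1 so the Unet's first convolution and final projection expect single-channel inputs, and normalize with single-channel statistics:

```python
import torchvision.transforms as T

# For greyscale data, the model's input/output channel count (often an
# `im_channels`-style config value; the exact key name is an assumption)
# should be 1 instead of 3, and normalization needs 1-channel stats:
transform = T.Compose([
    T.Grayscale(num_output_channels=1),  # no-op if images are already 1-channel
    T.ToTensor(),                        # (1, H, W) in [0, 1]
    T.Normalize([0.5], [0.5]),           # map to [-1, 1]
])
```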

Thanks for the very informative video! I am having trouble using my own dataset with this. I'm working on a MacBook in Google Colab. Currently, I have mounted my Drive in Colab and pulled in my dataset from Drive through the default.yaml. However, I am getting an error saying that num_samples should be positive, not 0. I am not sure what you mean by "Put the image files in a folder created within the repo root (example: data/images/*.png).". What is this repo root and where can I find it? Is it local on my computer? Could you help with this? Thank you in advance!

xdhanav
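For context (an assumption about the error, not a confirmed diagnosis): the "repo root" is simply the top-level directory of the cloned repository, the one containing default.yaml, and a "num_samples should be positive" error typically means the dataset glob matched no files. A quick sanity check along these lines (the paths are illustrative) can confirm the images are where the config expects them:

```python
from pathlib import Path

# "repo root" = the top-level folder of the cloned repository
# (the one containing default.yaml). Paths below are illustrative.
repo_root = Path("/content/your-cloned-repo")
image_dir = repo_root / "data" / "images"

files = sorted(image_dir.glob("*.png"))
print(f"found {len(files)} images in {image_dir}")
# If this count is 0, the path or extension in the config does not
# match the actual files, which would explain num_samples == 0.
```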

@Explaining-AI
Sorry to bother you, but I don't know why, whenever I am training on any dataset (I tried MNIST, CIFAR-10, etc.), the MSE loss is always NaN. Is this expected? I checked my transformation and it is correct: first transforms.ToTensor(), then transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5]). All the losses are NaN values; will the model learn anything meaningful?

takihasan
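A NaN loss is never expected in DDPM training, and the model will not learn anything from NaN gradients. As a generic debugging aid (not a diagnosis of this specific setup, and not the video's exact training loop; `model`, `optimizer`, and `scheduler` are placeholders, with `scheduler` shaped like the sketch near the top of this page), guarding inputs and clipping gradients often isolates the cause:

```python
import torch

def training_step(model, optimizer, scheduler, x, device="cpu"):
    """One DDPM training step with NaN guards (an illustrative sketch)."""
    x = x.to(device)
    assert torch.isfinite(x).all(), "non-finite values already in the input batch"

    t = torch.randint(0, scheduler.betas.shape[0], (x.shape[0],), device=device)
    noise = torch.randn_like(x)
    noisy = scheduler.add_noise(x, noise, t)

    loss = torch.nn.functional.mse_loss(model(noisy, t), noise)
    assert torch.isfinite(loss), "loss went non-finite; try a lower learning rate"

    optimizer.zero_grad()
    loss.backward()
    # Clipping guards against the exploding gradients that often precede NaNs.
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
    optimizer.step()
    return loss.item()
```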

I am getting a CUDA out-of-memory error when using my own dataset. The dataset consists of .npy files.

paramthakkar

Amazing explanation! But I have a question: I want to train on my custom RGB data with shape 128x128 or 256x256, but I always get out-of-memory errors, even though the model only has about 10M parameters. Can you help with that?

colder
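Regarding the two out-of-memory comments above: activation memory in a Unet grows with image resolution even when the parameter count stays small, so 256x256 inputs can exhaust a GPU that handles 64x64 easily. Common mitigations (a generic pattern, not specific to this repository; `model`, `optimizer`, `loader`, and `compute_loss` are placeholders) are a smaller batch size, mixed precision, and gradient accumulation:

```python
import torch

scaler = torch.cuda.amp.GradScaler()  # mixed precision cuts activation memory
accum_steps = 4                       # emulate a larger batch on a small GPU

for step, batch in enumerate(loader):
    with torch.cuda.amp.autocast():   # run the forward pass in float16 where safe
        loss = compute_loss(model, batch.to("cuda"))
    scaler.scale(loss / accum_steps).backward()
    if (step + 1) % accum_steps == 0:
        scaler.step(optimizer)        # unscales gradients, then optimizer.step()
        scaler.update()
        optimizer.zero_grad()
```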

Thank you so much for the video. It was amazing, and it explained many things that I couldn't understand anywhere else. Though I have a question regarding the up channels. You have given down channels as [32, 64, 128, 256]. As per your code, the channels for the first upsample will be (256, 64), but after concatenating the skip from the last down layer, the number of input channels for the first convolution of the resnet layer should be 128 + 256 = 384, yet as per your code it is 256. The same thing happens for each upblock: in the second case, 128 + 64 should be the in channels, but as per your code it is 128, and the third upsample layer should have in channels 64 + 32 = 96, but as per your code it is 64. I think there is a little miscalculation.

takihasan
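To make the concatenation arithmetic in the comment above concrete (a generic skip-connection sketch; whether the video's code reduces channels before or after the concat would need checking against the repository): if the decoder upsamples a 256-channel feature map and concatenates a 128-channel encoder skip, the next convolution must accept 256 + 128 = 384 input channels, unless the upsampling layer itself halves the channels first, which would reconcile the smaller numbers:

```python
import torch
import torch.nn as nn

batch, h, w = 2, 8, 8
deep = torch.randn(batch, 256, h, w)          # output of the mid block
skip = torch.randn(batch, 128, 2 * h, 2 * w)  # stored encoder feature map

# Variant A: upsample without changing channels -> conv sees 256 + 128 = 384.
up_a = nn.ConvTranspose2d(256, 256, kernel_size=4, stride=2, padding=1)
conv_a = nn.Conv2d(256 + 128, 128, kernel_size=3, padding=1)
out_a = conv_a(torch.cat([up_a(deep), skip], dim=1))

# Variant B: halve channels while upsampling -> conv sees 128 + 128 = 256,
# which would explain an in_channels of 256 in the code the comment refers to.
up_b = nn.ConvTranspose2d(256, 128, kernel_size=4, stride=2, padding=1)
conv_b = nn.Conv2d(128 + 128, 128, kernel_size=3, padding=1)
out_b = conv_b(torch.cat([up_b(deep), skip], dim=1))
```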

Hi Sir, I would like to request that you kindly make a change in the stable diffusion model repository regarding the size of the images, because the repository does not support large image sizes and requires very high GPU memory; for 256-size images it requires almost 200 GB, which is very costly. Also, if possible, please include a few evaluation metrics for quantitative analysis between the original and the generated images. Waiting for the next video!

muhammadawais
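On the evaluation-metrics request above: FID is the standard quantitative comparison between sets of real and generated images. A minimal sketch using torchmetrics (assuming that dependency is installed; the random tensors stand in for actual dataset images and DDPM samples):

```python
import torch
from torchmetrics.image.fid import FrechetInceptionDistance

# FID compares Inception-feature statistics of real vs. generated images.
# torchmetrics expects uint8 images of shape (N, 3, H, W) by default.
fid = FrechetInceptionDistance(feature=2048)

real = torch.randint(0, 256, (16, 3, 256, 256), dtype=torch.uint8)  # stand-in for dataset images
fake = torch.randint(0, 256, (16, 3, 256, 256), dtype=torch.uint8)  # stand-in for DDPM samples

fid.update(real, real=True)
fid.update(fake, real=False)
print(f"FID: {fid.compute().item():.2f}")  # lower is better; use many more than 16 images in practice
```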