Lesson 22: Deep Learning Foundations to Stable Diffusion

Oops, I say it's "Lesson 21" at the start of the video -- but actually this is Lesson 22!

The lesson covers several sampling techniques, including the Euler sampler, the Ancestral Euler sampler, and Heun's method. Jeremy explains the concepts behind these methods and demonstrates how they can be used to improve the sampling process. He emphasizes the importance of understanding the underlying concepts and techniques in research papers, and shows how these can be applied to improve model performance.
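
As a rough illustration of the simplest of these, here is a minimal sketch of an Euler sampler in the Karras (sigma-based) framing discussed later in the lesson. The `model(x, sigma)` interface returning a denoised estimate is an assumption for illustration, not the notebook's actual API:

```python
import torch

def euler_sample(model, x, sigmas):
    # Minimal Euler sampler sketch (assumed interface, not the lesson's code):
    # `model(x, sigma)` is taken to return a denoised estimate of x, and
    # `sigmas` is a decreasing 1-D tensor of noise levels ending at 0.
    for sigma, sigma_next in zip(sigmas[:-1], sigmas[1:]):
        denoised = model(x, sigma)
        d = (x - denoised) / sigma        # estimated slope dx/dsigma
        x = x + d * (sigma_next - sigma)  # one Euler step toward sigma_next
    return x
```

Heun's method refines each step by averaging this slope with the slope re-evaluated at sigma_next (one extra model call per step), while the ancestral variant re-injects a controlled amount of fresh noise after each step.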

00:00 - Intro
00:30 - Cosine Schedule (22_cosine)
06:05 - Sampling
09:37 - Summary / Notation
10:42 - Predicting the noise level of noisy Fashion MNIST images (22_noise-pred)
12:57 - Why .logit() when predicting alpha bar t
14:50 - Random baseline
16:40 - mse_loss why .flatten()
17:30 - Model & results
19:03 - Why are we trying to predict the noise level?
20:10 - Training diffusion without t - first attempt
22:58 - Why it isn’t working?
27:02 - Debugging (summary)
29:29 - Bug in ddpm - paper that cast some light on the issue
38:40 - Karras (Elucidating the Design Space of Diffusion-Based Generative Models)
49:47 - Picture of target images
52:48 - Scaling problem - (scalings)
59:42 - Training and predictions of modified model
1:03:49 - Sampling
1:06:05 - Sampling: Problems of composition
1:07:40 - Sampling: Rationale for rho selection (sketched below the chapter list)
1:09:40 - Sampling: Denoising
1:15:26 - Sampling: Heun's method (FID 0.972)
1:19:00 - Sampling: LMS sampler
1:20:00 - Karras Summary
1:23:00 - Comparison of different approaches
1:25:00 - Next lessons
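
For the rho chapter above: the noise-level schedule comes from Karras et al., who space the sampling sigmas evenly in sigma^(1/rho) space. A minimal sketch, with illustrative sigma_min/sigma_max defaults rather than the notebook's exact values:

```python
import torch

def karras_sigmas(n, sigma_min=0.01, sigma_max=80.0, rho=7.0):
    # Interpolate evenly in sigma**(1/rho) space, then raise back to rho.
    # Larger rho concentrates steps at low noise levels.
    ramp = torch.linspace(0, 1, n)
    min_inv, max_inv = sigma_min ** (1 / rho), sigma_max ** (1 / rho)
    sigmas = (max_inv + ramp * (min_inv - max_inv)) ** rho
    return torch.cat([sigmas, torch.zeros(1)])  # append sigma = 0 to finish
```

With rho = 1 the steps are evenly spaced in sigma; the paper settles on rho = 7 as a better trade-off, spending more of the step budget near low noise levels, where fine detail is resolved.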

Timestamps thanks to Piotr Czapla. Transcript thanks to Francisco Mussari