Lesson 9: Deep Learning Foundations to Stable Diffusion
We talk about some of the nifty tweaks available when using Stable Diffusion in Diffusers, and show how to use them:
- Guidance scale (varying how strongly the prompt is followed)
- Negative prompts (removing concepts from an image)
- Image initialisation (starting from an existing image)
- Textual inversion (adding your own concepts to generated images)
- Dreambooth (an alternative approach to textual inversion)
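The guidance scale works by running the noise predictor twice, with and without the prompt, and extrapolating between the two predictions. A minimal numpy sketch of that combination step (the function name `guided_noise` and the toy arrays are illustrative, not the Diffusers API):

```python
import numpy as np

def guided_noise(noise_uncond, noise_cond, guidance_scale):
    # classifier-free guidance: push the conditional prediction further
    # away from the unconditional one as the scale grows
    return noise_uncond + guidance_scale * (noise_cond - noise_uncond)

# toy stand-ins for the two U-Net outputs
uncond = np.zeros(3)
cond = np.ones(3)
print(guided_noise(uncond, cond, 7.5))  # → [7.5 7.5 7.5]
```

A scale of 1 returns the conditional prediction unchanged; larger values (7.5 is a common default) exaggerate the difference the prompt makes.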
The second half of the lesson covers the key concepts involved in Stable Diffusion:
- CLIP embeddings
- The VAE (variational autoencoder)
- Predicting noise with the unet
- Removing noise with schedulers
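The interplay of the last two items can be sketched with a toy deterministic denoising loop: a "noise predictor" stands in for the U-Net (here it cheats and returns the true noise), and a DDIM-style scheduler step repeatedly estimates the clean latent and re-noises it a little less. The schedule values and array shapes are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
x0  = rng.normal(size=(4, 4))   # stand-in for a clean latent "image"
eps = rng.normal(size=(4, 4))   # the noise that was mixed in

# alpha-bar: fraction of signal kept at each step, from very noisy to clean
abar = np.linspace(0.02, 0.9999, 50)

def predict_noise(x_t, i):
    """Stand-in for the U-Net; a perfect predictor returns the true noise."""
    return eps

# start from the noisiest latent
x = np.sqrt(abar[0]) * x0 + np.sqrt(1 - abar[0]) * eps

# scheduler loop: estimate x0 from the noise prediction, step to less noise
for i in range(len(abar) - 1):
    e = predict_noise(x, i)
    x0_hat = (x - np.sqrt(1 - abar[i]) * e) / np.sqrt(abar[i])
    x = np.sqrt(abar[i + 1]) * x0_hat + np.sqrt(1 - abar[i + 1]) * e

print(np.max(np.abs(x - x0)))  # small: the loop recovered the clean latent
```

In real Stable Diffusion the predictor is imperfect, which is why many small steps (rather than one big jump) are needed.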
0:00 - Introduction
6:38 - This course vs DALL-E 2
10:38 - How to take full advantage of this course
12:14 - Cloud computing options
14:58 - Getting started (Github, notebooks to play with, resources)
20:48 - Diffusion notebook from Hugging Face
26:59 - How stable diffusion works
30:06 - Diffusion notebook (guidance scale, negative prompts, init image, textual inversion, Dreambooth)
45:00 - Stable diffusion explained
53:04 - Math notation correction
1:14:37 - Creating a neural network to predict noise in an image
1:27:46 - Working with images and compressing the data with autoencoders
1:40:12 - Explaining latents that will be input into the unet
1:43:54 - Adding text as a one-hot encoded input to the noise and drawing (aka guidance)
1:47:06 - How to represent numbers vs text embeddings in our model with CLIP encoders
1:53:13 - CLIP encoder loss function
2:00:55 - Caveat regarding "time steps"
2:07:04 - Why don’t we do this all in one step?
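The CLIP loss discussed above (1:53:13) is a symmetric contrastive loss: embed a batch of images and their captions, compute all pairwise similarities, and train so that each image is most similar to its own caption and vice versa. A toy numpy sketch (the function `clip_loss` and the identity-matrix embeddings are illustrative, not the real CLIP model):

```python
import numpy as np

def clip_loss(img_emb, txt_emb, temperature=0.07):
    """Symmetric contrastive loss over an image/text similarity matrix."""
    # L2-normalise so the dot product is cosine similarity
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = img @ txt.T / temperature     # (N, N); matching pairs on diagonal
    n = len(logits)

    def xent(l):
        l = l - l.max(axis=1, keepdims=True)  # numerical stability
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -logp[np.arange(n), np.arange(n)].mean()

    # cross-entropy both ways: images -> captions and captions -> images
    return (xent(logits) + xent(logits.T)) / 2

emb = np.eye(4)  # four perfectly matched image/text pairs
print(clip_loss(emb, emb) < clip_loss(emb, np.roll(emb, 1, axis=0)))  # → True
```

Mismatching the pairs (the `np.roll`) raises the loss, which is exactly the pressure that pulls matching image and text embeddings together during training.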