Lesson 18: Deep Learning Foundations to Stable Diffusion

We continue by implementing the OneCycleLR scheduler from PyTorch, which adjusts the learning rate and momentum during training. We also discuss how to improve the architecture of a neural network by making it deeper and wider, introducing ResNets and the concept of residual connections. Finally, we explore various ResNet architectures from the PyTorch Image Models (timm) library and experiment with data augmentation techniques, such as random erasing and test time augmentation.
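
The scheduler piece of this can be tried outside the course notebooks; below is a minimal sketch of driving PyTorch's OneCycleLR directly, which warms the learning rate up and anneals it back down while cycling momentum the opposite way. The model, data, and hyperparameter values are placeholders, not the ones from the lesson.

```python
import torch
from torch import nn, optim
from torch.optim.lr_scheduler import OneCycleLR

# Placeholder model and batch; the lesson uses its own Learner and Fashion-MNIST data.
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
xb = torch.randn(64, 1, 28, 28)
yb = torch.randint(0, 10, (64,))

epochs, steps_per_epoch = 5, 100
opt = optim.SGD(model.parameters(), lr=0.1, momentum=0.9)

# OneCycleLR adjusts both the learning rate and the momentum over the whole run.
sched = OneCycleLR(opt, max_lr=0.4, epochs=epochs, steps_per_epoch=steps_per_epoch)

for epoch in range(epochs):
    for step in range(steps_per_epoch):
        loss = nn.functional.cross_entropy(model(xb), yb)
        loss.backward()
        opt.step()
        opt.zero_grad()
        sched.step()  # stepped once per batch, not once per epoch
```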

0:00:00 - Accelerated SGD done in Excel
0:01:35 - Basic SGD
0:10:56 - Momentum
0:15:37 - RMSProp
0:16:35 - Adam
0:20:11 - Adam with annealing tab
0:23:02 - Learning Rate Annealing in PyTorch
0:26:34 - How PyTorch's optimizers work
0:32:44 - How schedulers work
0:34:32 - Plotting learning rates from a scheduler
0:36:36 - Creating a scheduler callback
0:40:03 - Training with Cosine Annealing
0:42:18 - 1-Cycle learning rate
0:48:26 - HasLearnCB - passing learn as parameter
0:51:01 - Changes from last week, /compare in GitHub
0:52:40 - fastcore’s patch to the Learner with lr_find
0:55:11 - New fit() parameters
0:56:38 - ResNets
1:17:44 - Training the ResNet
1:21:17 - ResNets from timm
1:23:48 - Going wider
1:26:02 - Pooling
1:31:15 - Reducing the number of parameters and megaFLOPS
1:35:34 - Training for longer
1:38:06 - Data Augmentation
1:45:56 - Test Time Augmentation
1:49:22 - Random Erasing
1:55:55 - Random Copying
1:58:52 - Ensembling
2:00:54 - Wrap-up and homework

Many thanks to Francisco Mussari for timestamps and transcription.
Comments

Bam. This lesson is dynamite. So much depth in just one lesson. ❤

mkamp

Around 1:58:00 (random copy): to truly preserve the existing distribution, we could not only copy the patch from a to b, but also copy what was at b before the copy back to a, i.e. swap the two patches.

mkamp
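
A minimal sketch of the swap idea from the comment above, assuming a plain PyTorch batch of images; the function name and patch size are illustrative, not from the lesson's notebook:

```python
import torch

def rand_swap_(x, pct=0.2):
    """In-place: swap two randomly chosen patches in a batch of images (N, C, H, W)."""
    _, _, h, w = x.shape
    ph, pw = int(h * pct), int(w * pct)
    # Top-left corners of the two patches (they may overlap; a fuller version
    # would resample until they are disjoint).
    ya, xa = torch.randint(0, h - ph, (1,)).item(), torch.randint(0, w - pw, (1,)).item()
    yb, xb = torch.randint(0, h - ph, (1,)).item(), torch.randint(0, w - pw, (1,)).item()
    a = x[:, :, ya:ya + ph, xa:xa + pw].clone()
    b = x[:, :, yb:yb + ph, xb:xb + pw].clone()
    x[:, :, ya:ya + ph, xa:xa + pw] = b
    x[:, :, yb:yb + ph, xb:xb + pw] = a
    return x

imgs = torch.randn(16, 1, 28, 28)
rand_swap_(imgs)
```

Because the two patches are exchanged rather than overwritten, the multiset of pixel values in each image is unchanged (as long as the patches don't overlap).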

The random replacement doesn't need to use slices/patches; it could "swap" individual pixels, which is even easier to implement.

seanriley
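
A sketch of that per-pixel variant (again illustrative, not from the lesson): pick a random subset of locations and permute their values among themselves, so the image's pixel distribution is preserved exactly.

```python
import torch

def rand_pixel_shuffle_(x, pct=0.1):
    """In-place: shuffle a random subset of pixel locations within each image (N, C, H, W)."""
    n, c, h, w = x.shape
    flat = x.view(n, c, h * w)            # view shares storage with x
    k = int(h * w * pct)
    idx = torch.randperm(h * w)[:k]       # locations to shuffle (same for the whole batch)
    perm = idx[torch.randperm(k)]         # a permutation of those same locations
    flat[:, :, idx] = flat[:, :, perm]    # every value stays inside its image
    return x

imgs = torch.randn(16, 1, 28, 28)
rand_pixel_shuffle_(imgs)
```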

Jeremy's comment about Twitter not existing is quite apt. It's now X.

alexkelly

Around 1:36:00: batchnorm scales the activations, so the activations are scaled both by the layer weights and by batchnorm's gamma. Does regularizing the weights of the linear modules become ineffective if the model learns to increase gamma instead? And it would, because there is only one gamma parameter per module but many weight parameters, so the penalty on gamma has little impact on the loss? Is that what Jeremy explains? And would the same be true for LayerNorm?

mkamp
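
Not an answer from the lesson, but related to the question above: one common way this interaction is handled in practice is to exclude the norm layers' gamma/beta (and biases) from weight decay via optimizer parameter groups. A rough sketch with a placeholder model:

```python
from torch import nn, optim

model = nn.Sequential(
    nn.Conv2d(1, 8, 3, padding=1), nn.BatchNorm2d(8), nn.ReLU(),
    nn.Flatten(), nn.Linear(8 * 28 * 28, 10),
)

decay, no_decay = [], []
for module in model.modules():
    for name, p in module.named_parameters(recurse=False):
        # Norm affine params (gamma/beta) and biases are left unregularized;
        # only the conv/linear weight matrices get weight decay.
        if isinstance(module, nn.BatchNorm2d) or name == "bias":
            no_decay.append(p)
        else:
            decay.append(p)

opt = optim.SGD(
    [{"params": decay, "weight_decay": 1e-2},
     {"params": no_decay, "weight_decay": 0.0}],
    lr=0.1, momentum=0.9,
)
```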

Just before you went into copying, I was sitting here thinking you could do a random shuffle to maintain the distribution.

It may not matter, but the distribution still changes when you delete pixels and fill them with copies: after all, there are now more of the ones you copied.

(I should write this on the forums, but for now I'll write it here lest I forget.)

JensNyborg