Lesson 17: Deep Learning Foundations to Stable Diffusion
We also cover variance, standard deviation, and covariance, and their significance in understanding relationships between data points. We create a novel Generalized ReLU activation function and discuss the Layer-wise Sequential Unit Variance (LSUV) technique for initializing any neural network. We explore normalization techniques, such as Layer Normalization and Batch Normalization, and briefly mention other normalization methods like Instance Norm and Group Norm.
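The Generalized ReLU mentioned above can be sketched as a leaky ReLU that subtracts a small constant (pulling the activation mean back toward 0) and optionally clips large values. This is a minimal numpy sketch; the default `leak`, `sub`, and `maxv` values here are illustrative, not the lesson's exact settings.

```python
import numpy as np

def general_relu(x, leak=0.1, sub=0.4, maxv=None):
    # Leaky ReLU: small negative slope instead of a hard zero
    out = np.where(x > 0, x, leak * x)
    # Subtract a constant so the post-activation mean sits closer to 0
    out = out - sub
    # Optionally clip very large activations
    if maxv is not None:
        out = np.minimum(out, maxv)
    return out

x = np.array([-2.0, -0.5, 0.0, 1.0, 3.0])
print(general_relu(x))  # negative inputs leak through, everything shifted down by 0.4
```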
Finally, we experiment with different batch sizes, learning rates, and optimizers like Accelerated SGD, RMSProp, and Adam to improve performance.
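As a rough illustration of how Adam combines the two ideas above, here is a minimal single-parameter sketch: a momentum-style running average of gradients, an RMSProp-style running average of squared gradients, and the bias-corrected update. Hyperparameter values are the common defaults, not necessarily those used in the lesson.

```python
import numpy as np

def adam_step(p, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    # Momentum: exponentially weighted average of gradients
    m = beta1 * m + (1 - beta1) * grad
    # RMSProp: exponentially weighted average of squared gradients
    v = beta2 * v + (1 - beta2) * grad ** 2
    # Bias correction for the zero-initialized averages
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    # Step scaled by the root of the squared-gradient average
    p = p - lr * m_hat / (np.sqrt(v_hat) + eps)
    return p, m, v

# Toy usage: minimize f(p) = p**2, whose gradient is 2p
p, m, v = 5.0, 0.0, 0.0
for t in range(1, 1001):
    p, m, v = adam_step(p, 2 * p, m, v, t, lr=0.1)
print(p)  # close to the minimum at 0
```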
0:00:00 - Changes to previous lesson
0:07:50 - Trying to get 90% accuracy on Fashion-MNIST
0:11:58 - Jupyter notebooks and GPU memory
0:14:59 - Autoencoder or Classifier
0:16:05 - Why do we need a mean of 0 and standard deviation of 1?
0:21:21 - What exactly do we mean by variance?
0:25:56 - Covariance
0:29:33 - Xavier Glorot initialization
0:35:27 - ReLU and Kaiming He initialization
0:36:52 - Applying an init function
0:38:59 - Learning rate finder and MomentumLearner
0:40:10 - What’s happening in each stride-2 convolution?
0:42:32 - Normalizing input matrix
0:46:09 - 85% accuracy
0:47:30 - Using with_transform to modify input data
0:48:18 - ReLU and 0 mean
0:52:06 - Changing the activation function
0:55:09 - 87% accuracy and nice looking training graphs
0:57:16 - “All You Need Is a Good Init”: Layer-wise Sequential Unit Variance
1:03:55 - Batch Normalization, Intro
1:06:39 - Layer Normalization
1:15:47 - Batch Normalization
1:23:28 - Batch Norm, Layer Norm, Instance Norm and Group Norm
1:26:11 - Putting it all together: Towards 90%
1:28:42 - Accelerated SGD
1:33:32 - Regularization
1:37:37 - Momentum
1:45:32 - Batch size
1:46:37 - RMSProp
1:51:27 - Adam: RMSProp plus Momentum
Timestamps and transcript thanks to fmussari