NN - 16 - L2 Regularization / Weight Decay (Theory + @PyTorch code)

In this video we look into L2 regularization, also known as weight decay: how it works, the intuition behind it, and how to use it in practice with some PyTorch code.
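The video's notebook is not reproduced in this description; as a minimal sketch of the idea (the toy data, model, and hyperparameters below are illustrative assumptions), L2 regularization can be applied in PyTorch either through the optimizer's weight_decay argument or by adding the penalty to the loss explicitly:

import torch
import torch.nn as nn

# Toy regression data, for illustration only.
torch.manual_seed(0)
X = torch.randn(128, 10)
y = torch.randn(128, 1)

model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))
criterion = nn.MSELoss()

# Built-in weight decay: SGD adds weight_decay * w to each gradient,
# which is equivalent to an L2 penalty of (weight_decay / 2) * ||w||^2 on the loss.
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2, weight_decay=1e-4)

for epoch in range(100):
    optimizer.zero_grad()
    loss = criterion(model(X), y)
    loss.backward()
    optimizer.step()

# Equivalent explicit penalty (with weight_decay=0 in the optimizer):
# l2 = sum(p.pow(2).sum() for p in model.parameters())
# loss = criterion(model(X), y) + (1e-4 / 2) * l2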
Become a member and get full access to this online course:
*** 🎉 Special YouTube 60% Discount on Yearly Plan – valid for the 1st 100 subscribers; Voucher code: First100 🎉 ***
"NN with Python" Course Outline:
*Intro*
* Administration
* Intro - Long
* Notebook - Intro to Python
* Notebook - Intro to PyTorch
*Comparison to other methods*
* Linear Regression vs. Neural Network
* Logistic Regression vs. Neural Network
* GLM vs. Neural Network
*Expressivity / Capacity*
* Hidden Layers: 0 vs. 1 vs. 2+
*Training*
* Backpropagation - Part 1
* Backpropagation - Part 2
* Implement a NN in NumPy
* Notebook - Implementation redo: Classes instead of Functions (NumPy)
* Classification - Softmax and Cross Entropy - Theory
* Classification - Softmax and Cross Entropy - Derivatives
* Notebook - Implementing Classification (NumPy)
*Autodiff*
* Automatic Differentiation
* Forward vs. Reverse mode
*Symmetries in Weight Space*
* Tanh & Permutation Symmetries
* Notebook - Tanh, Permutation, ReLU symmetries
*Generalization*
* Generalization and the Bias-Variance Trade-Off
* Generalization Code
* L2 Regularization / Weight Decay
* DropOut regularization
* Notebook - DropOut (PyTorch)
* Notebook - DropOut (NumPy)
* Notebook - Early Stopping
*Improved Training*
* Weight Initialization - Part 1: What NOT to do
* Notebook - Weight Initialization 1
* Weight Initialization - Part 2: What to do
* Notebook - Weight Initialization 2
* Notebook - TensorBoard
* Learning Rate Decay
* Notebook - Input Normalization
* Batch Normalization - Part 1: Theory
* Batch Normalization - Part 2: Derivatives
* Notebook - BatchNorm (PyTorch)
* Notebook - BatchNorm (NumPy)
*Activation Functions*
* Classical Activations
* ReLU Variants
*Optimizers*
* SGD Variants: Momentum, NAG, AdaGrad, RMSprop, AdaDelta, Adam, AdaMax, Nadam - Part 1: Theory
* SGD Variants: Momentum, NAG, AdaGrad, RMSprop, AdaDelta, Adam, AdaMax, Nadam - Part 2: Code
*Auto Encoders*
* Variational Auto Encoders
~~~~~ SUPPORT ~~~~~
~~~~~~~~~~~~~~~~~
Intro/Outro Music: Dreamer - by Johny Grimes