Weight Initialization explained | A way to reduce the vanishing gradient problem

Let's talk about how the weights in an artificial neural network are initialized, how this initialization affects the training process, and what YOU can do about it!

To kick off our discussion on weight initialization, we'll first look at how these weights are typically set and how the initial values might negatively affect the training process. We'll see that randomly initialized weights actually contribute to the vanishing and exploding gradient problem we covered in the last video.

With this in mind, we'll then explore what we can do to influence how this initialization occurs. We'll see how Xavier initialization (also called Glorot initialization) can help combat this problem. Then, we'll see how we can specify the way the weights for a given layer are initialized in code using the kernel_initializer parameter in Keras.
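
As a quick illustration, here is a minimal Keras sketch (not the exact code from the video; the model and layer sizes are placeholders) of setting kernel_initializer on Dense layers:

from tensorflow import keras
from tensorflow.keras.layers import Dense

# 'glorot_uniform' (Xavier/Glorot) is the default kernel_initializer for Dense layers;
# it's written out explicitly here, and the second layer passes an initializer object instead.
model = keras.Sequential([
    keras.Input(shape=(8,)),
    Dense(16, activation='relu', kernel_initializer='glorot_uniform'),
    Dense(16, activation='relu', kernel_initializer=keras.initializers.GlorotNormal()),
    Dense(2, activation='softmax'),
])
model.summary()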

Reference to original paper by Xavier Glorot and Yoshua Bengio:

🕒🦎 VIDEO SECTIONS 🦎🕒

00:30 Help deeplizard add video timestamps - See example in the description
09:42 Collective Intelligence and the DEEPLIZARD HIVEMIND

💥🦎 DEEPLIZARD COMMUNITY RESOURCES 🦎💥

👋 Hey, we're Chris and Mandy, the creators of deeplizard!

👉 Check out the website for more learning material:

💻 ENROLL TO GET DOWNLOAD ACCESS TO CODE FILES

🧠 Support collective intelligence, join the deeplizard hivemind:

🧠 Use code DEEPLIZARD at checkout to receive 15% off your first Neurohacker order
👉 Use your receipt from Neurohacker to get a discount on deeplizard courses

👀 CHECK OUT OUR VLOG:

❤️🦎 Special thanks to the following polymaths of the deeplizard hivemind:
Tammy
Mano Prime
Ling Li

🚀 Boost collective intelligence by sharing this video on social media!

👀 Follow deeplizard:

🎓 Deep Learning with deeplizard:

🎓 Other Courses:

🛒 Check out products deeplizard recommends on Amazon:

🎵 deeplizard uses music by Kevin MacLeod

❤️ Please use the knowledge gained from deeplizard content for good, not evil.
Comments

Machine Learning / Deep Learning Tutorials for Programmers playlist:

Keras Machine Learning / Deep Learning Tutorial playlist:

Data Science for Programming Beginners playlist:

deeplizard

God sent you to help machine learning learners.

golangshorts

This series is a hidden gem! You deserve more views!! Thank you

shauryr

You have put together a great, concise, accessible series for the uninitiated in the field of deep learning. Thank you!

Iamine

0:22 intro
0:54 how weights matter
2:48 bad weights can cause vanishing gradient
4:44 heuristic for initial weight
7:24 keras code

CosmiaNebula

Just amazing!
I'll always be thankful to you for providing us with these astounding videos!

aaryannakhat

This series is really awesome. Thanks a lot for the detailed explanations. This certainly took a lot of effort, so I (and I guess everyone else who is watching and subscribing to this) highly appreciate it. Could you also cover the other layers in Keras, e.g. Embedding, TimeDistributed, ...? :)

DanielWeikert

I have never come across a video that explains this concept so well. Awesome!

manjeetnagi

WOW, I love the new style! PLEASE add more videos!
Your way of explaining is so clear.
Thanks.
KEEP IT UP!

Waleed-qveg

Some explanatory remarks on the equation var(weights) = 2 / (n_in + n_out):
The mentioned var(weights) = 1 / n_in (or 2 / n_in when using ReLU) turned out to work well for the forward pass of the input, since it keeps the magnitude of the activations approximately constant. However, with this the problem of vanishing gradients still exists during backpropagation. From the perspective of backpropagation, it would be ideal to have var(weights) = 1 / n_out. Thus, in their 2010 article, Glorot and Bengio settled on a compromise: var(weights) = 2 / (n_in + n_out), where n_in and n_out are the number of neurons in the layer before and after a given neuron, respectively.


This is from a lecture I attended on deep neural networks.
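
The compromise is easy to check numerically. A rough NumPy sketch (just an illustration; the layer sizes are arbitrary) of drawing weights with variance 2 / (n_in + n_out) from either a normal or a uniform distribution:

import numpy as np

n_in, n_out = 250, 100                     # fan-in and fan-out of a layer (arbitrary)
target_var = 2.0 / (n_in + n_out)          # the Glorot compromise variance

# Glorot normal: zero-mean Gaussian with the target variance
W_normal = np.random.normal(0.0, np.sqrt(target_var), size=(n_in, n_out))

# Glorot uniform: U(-limit, limit) with limit = sqrt(6 / (n_in + n_out));
# the variance of U(-a, a) is a^2 / 3, which again gives 2 / (n_in + n_out)
limit = np.sqrt(6.0 / (n_in + n_out))
W_uniform = np.random.uniform(-limit, limit, size=(n_in, n_out))

print(target_var, W_normal.var(), W_uniform.var())   # all approximately equal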

torgoron

The explanation is so perfect and clear that there are no dislikes on the video!! Loved your voice and way of explaining :)

raviteja

Thank you very much! Now I understand better what goes on with Xavier initialization!

fernandobaladi

Thank you very much for this video! I really enjoyed learning about initialization and how it connects with everything else! Great that it can be used to tackle the vanishing/exploding gradient problem!

tymothylim

I love your voice, it's so soothing to listen to.

asdfasdfuhf

Amazing video as always! Thank you for your contribution to the machine learning community, it's very valuable and we learn a lot from you.

parismollo

You're an angel to me. Thanks for saving my time and also for reducing my stress levels in understanding the concepts.

Viralvlogvideos

This YouTube channel is a blessing from god, if there is one :D. By sharing your knowledge in such an easy way, you are seriously doing so much GOOD. Thank you <3

moritzpainz

This playlist is really good. Grateful to you for your effort! :)

prashanthvaidya

The explanation given is great. I was expecting things in much more depth, but that's fine. Now I have clarity on what I need to dissect, and I will definitely explore the content on this channel. Love and respect from India. Keep up the good work :)

harishpawar

Loved this ❤️
Please make a video on how ResNet helps in solving the problem of vanishing and exploding gradients.

saileshpatra