Weight Initialization explained | A way to reduce the vanishing gradient problem

Let's talk about how the weights in an artificial neural network are initialized, how this initialization affects the training process, and what YOU can do about it!

To kick off our discussion on weight initialization, we'll first look at how these weights are typically set and how the initial values might negatively affect the training process. We'll see that randomly initialized weights actually contribute to the vanishing and exploding gradient problem we covered in the last video.

With this in mind, we'll then explore what we can do to influence how this initialization occurs. We'll see how Xavier initialization (also called Glorot initialization) can help combat this problem. Then, we'll see how we can specify the way the weights for a given layer are initialized in code using the kernel_initializer parameter in Keras.
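
As a quick illustration, here is a minimal Keras sketch (not the exact code from the video; the model and layer sizes are placeholders) of setting kernel_initializer on Dense layers:

from tensorflow import keras
from tensorflow.keras.layers import Dense

# 'glorot_uniform' (Xavier/Glorot) is the default kernel_initializer for Dense layers;
# it's written out explicitly here, and the second layer passes an initializer object instead.
model = keras.Sequential([
    keras.Input(shape=(8,)),
    Dense(16, activation='relu', kernel_initializer='glorot_uniform'),
    Dense(16, activation='relu', kernel_initializer=keras.initializers.GlorotNormal()),
    Dense(2, activation='softmax'),
])
model.summary()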

Reference to original paper by Xavier Glorot and Yoshua Bengio:

🕒🦎 VIDEO SECTIONS 🦎🕒

00:30 Help deeplizard add video timestamps - See example in the description
09:42 Collective Intelligence and the DEEPLIZARD HIVEMIND

💥🦎 DEEPLIZARD COMMUNITY RESOURCES 🦎💥

👋 Hey, we're Chris and Mandy, the creators of deeplizard!

👉 Check out the website for more learning material:

💻 ENROLL TO GET DOWNLOAD ACCESS TO CODE FILES

🧠 Support collective intelligence, join the deeplizard hivemind:

🧠 Use code DEEPLIZARD at checkout to receive 15% off your first Neurohacker order
👉 Use your receipt from Neurohacker to get a discount on deeplizard courses

👀 CHECK OUT OUR VLOG:

❤️🦎 Special thanks to the following polymaths of the deeplizard hivemind:
Tammy
Mano Prime
Ling Li

🚀 Boost collective intelligence by sharing this video on social media!

👀 Follow deeplizard:

🎓 Deep Learning with deeplizard:

🎓 Other Courses:

🛒 Check out products deeplizard recommends on Amazon:

🎵 deeplizard uses music by Kevin MacLeod

❤️ Please use the knowledge gained from deeplizard content for good, not evil.
Comments

Machine Learning / Deep Learning Tutorials for Programmers playlist:

Keras Machine Learning / Deep Learning Tutorial playlist:

Data Science for Programming Beginners playlist:

deeplizard

God sent you to help machine learning learners.

golangshorts

This series is a hidden gem! You deserve more views!! Thank you

shauryr

You have put together a great, concise, accessible series for the uninitiated in the field of deep learning. Thank you!

Iamine

0:22 intro
0:54 how weights matter
2:48 bad weights can cause vanishing gradient
4:44 heuristic for initial weight
7:24 keras code

CosmiaNebula

Just amazing!
I'll always be thankful to you for providing us with these astounding videos!

aaryannakhat

This series is really awesome. Thanks a lot for the detailed explanations. This certainly took a lot of effort, so I (and I guess everyone else who is watching and subscribing to this) highly appreciate it. Could you also cover the other layers in Keras, e.g. Embedding, TimeDistributed, ...? :)

DanielWeikert

I have never come across a video that explains this concept so well. Awesome!

manjeetnagi

WOW, I love the new style! PLEASE add more videos!
Your way of explaining is so clear.
Thanks.
KEEP IT UP!

Waleed-qveg

Some explanatory remarks on the equation var(weights) = 2 / (n_in + n_out):
The mentioned var(weights) = 1 / n_in (or 2 / n_in when using ReLU) turned out to work well for the forward pass of the input, since it keeps the magnitude of the activations approximately constant. However, with this the problem of vanishing gradients still exists during backpropagation. From the perspective of backpropagation, it would be ideal to have var(weights) = 1 / n_out. Thus, in their 2010 article, Glorot and Bengio settled on a compromise: var(weights) = 2 / (n_in + n_out), where n_in and n_out are the number of neurons in the layer before and after a given neuron, respectively.


This is from a lecture I attended on deep neural networks.
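
The compromise is easy to check numerically. A rough NumPy sketch (just an illustration; the layer sizes are arbitrary) of drawing weights with variance 2 / (n_in + n_out) from either a normal or a uniform distribution:

import numpy as np

n_in, n_out = 250, 100                     # fan-in and fan-out of a layer (arbitrary)
target_var = 2.0 / (n_in + n_out)          # the Glorot compromise variance

# Glorot normal: zero-mean Gaussian with the target variance
W_normal = np.random.normal(0.0, np.sqrt(target_var), size=(n_in, n_out))

# Glorot uniform: U(-limit, limit) with limit = sqrt(6 / (n_in + n_out));
# the variance of U(-a, a) is a^2 / 3, which again gives 2 / (n_in + n_out)
limit = np.sqrt(6.0 / (n_in + n_out))
W_uniform = np.random.uniform(-limit, limit, size=(n_in, n_out))

print(target_var, W_normal.var(), W_uniform.var())   # all approximately equal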

torgoron

The explanation is so perfect and clear that there are no dislikes on the video!! Loved your voice and way of explaining :)

raviteja

Thank you very much! Now I understand better what goes on with Xavier initialization!

fernandobaladi

Thank you very much for this video! I really enjoyed learning about initialization and how it connects with everything else! Great that it can be used to tackle the vanishing/exploding gradient problem!

tymothylim

I love your voice, it's so soothing to listen to.

asdfasdfuhf

Amazing video as always! Thank you for your contribution to the machine learning community, it's very valuable and we learn a lot from you.

parismollo

You're an angel to me. Thanks for saving my time and also for reducing my stress levels in understanding the concepts.

Viralvlogvideos

This YouTube channel is a blessing from god, if there is one :D. By sharing your knowledge in such an easy way, you are seriously doing so much GOOD. Thank you <3

moritzpainz

This playlist is really good. Grateful to you for your effort! :)

prashanthvaidya

The explanation given is great. I was expecting things in much more depth, but that's fine. Now I have clarity on what I need to dissect, and I will definitely explore the content on this channel. Love and respect from India. Keep up the good work :)

harishpawar

Loved this ❤️
Please make a video on how ResNet helps in solving the problem of vanishing and exploding gradients.

saileshpatra