Tutorial 14 - Stochastic Gradient Descent with Momentum

In this post I’ll talk about a simple addition to the classic SGD algorithm, called momentum, which almost always works better and faster than plain Stochastic Gradient Descent. SGD with momentum is a method that accelerates gradient vectors in the right directions, leading to faster convergence. It is one of the most popular optimization algorithms, and many state-of-the-art models are trained using it. Before jumping to the update equations of the algorithm, let’s look at some of the math that underlies how momentum works.
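To make this concrete, here is a minimal Python sketch (not from the video) comparing plain SGD with SGD plus momentum; the toy loss L(w) = 0.1 * (w - 3)**2 and all hyperparameter values below are illustrative assumptions.

def grad(w):
    return 0.2 * (w - 3.0)                        # dL/dw for the toy loss

learning_rate, gamma, steps = 0.1, 0.9, 100

# Plain SGD: w <- w - learning_rate * gradient
w_sgd = 0.0
for _ in range(steps):
    w_sgd -= learning_rate * grad(w_sgd)

# SGD with momentum: v <- gamma * v + learning_rate * gradient, then w <- w - v
w_mom, v = 0.0, 0.0
for _ in range(steps):
    v = gamma * v + learning_rate * grad(w_mom)
    w_mom -= v

print(f"plain SGD:      w = {w_sgd:.3f}")         # still noticeably short of 3
print(f"SGD + momentum: w = {w_mom:.3f}")         # much closer to the minimum at 3

On this shallow slope the momentum term keeps accumulating gradients that point the same way, so the steps grow and the minimum is reached far sooner than with plain SGD.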

Below are the various playlists I have created on ML, Data Science and Deep Learning. Please subscribe and support the channel. Happy learning!

You can buy my book on Finance with Machine Learning and Deep Learning from the URL below.

🙏 You just need to do 3 things to support my channel:
LIKE, SHARE & SUBSCRIBE to my YouTube channel.
Comments

You're doing really great. It's really good that you're focusing on the theory part and making it crystal clear for everyone.

allenalex

I just love you, Krish. No need to search the web; Krish Naik is there to clear up all the ideas. I like your approach of teaching theory first and then practice. Doing the practical work without clearing up the theory is useless. Thank you.

sukumarroychowdhury

Krish, you are doing a really great job. Even though I completed my MSc in Data Science and have some work experience, I am learning so much more from your tutorials. Lots of love from Saudi Arabia 😃

story_teller_

If you were confused by the SGD momentum equations at 11:30, here are all the equations written out again (see the code sketch after this comment).

Weight update formula:
w2 = w1 - (learning_rate * dL/dw1)

Define a new variable g1 = dL/dw1
and v1 = learning_rate * g1

So you can write the weight update formula again as
w2 = w1 - v1

Now come to the exponential moving average part:
v1 = learning_rate * g1
v2 = gamma * v1 + (learning_rate * g2)
v_n = gamma * v_(n-1) + (learning_rate * g_n)

So the final equation will be
w_n = w_(n-1) - v_n

Case 1: if gamma is 0, then
w_n = w_(n-1) - learning_rate * g_n  (plain SGD)

Case 2: if gamma is not 0,
w_n = w_(n-1) - v_n = w_(n-1) - (gamma * v_(n-1) + learning_rate * g_n)

shahrukhsharif
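Translating the recurrence above into code: a minimal Python sketch, where the toy loss L(w) = w**2 and all hyperparameter values are illustrative assumptions, not taken from the video.

def dl_dw(w):
    return 2.0 * w                         # gradient of the toy loss L(w) = w**2

def momentum_sgd(w, gamma, learning_rate, steps):
    v = 0.0                                # v_0 = 0: no velocity before the first step
    for _ in range(steps):
        g = dl_dw(w)                       # g_n = dL/dw_(n-1)
        v = gamma * v + learning_rate * g  # v_n = gamma * v_(n-1) + learning_rate * g_n
        w = w - v                          # w_n = w_(n-1) - v_n
    return w

# Case 1: gamma = 0 reduces to plain SGD (w_n = w_(n-1) - learning_rate * g_n).
print(momentum_sgd(w=5.0, gamma=0.0, learning_rate=0.1, steps=50))
# Case 2: gamma = 0.9 keeps an exponentially decaying sum of all past gradients.
print(momentum_sgd(w=5.0, gamma=0.9, learning_rate=0.1, steps=50))

Both calls converge toward the minimum of the toy loss at w = 0; only the path they take differs.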

Understanding the concepts is very important. When I started deep learning, I was not able to understand any of the terminology. After watching your tutorials, I am able to correlate everything. Thank you so much.

pravinkaushikbsp

Yes, we need to understand the basic concepts first and then apply them practically. Well organized lecture topics. Great, keep going sir.

brindhasenthilkumar

Thank you for explaining SGD+Momentum. I have a much more intuitive understanding of the method now.

melodytune

Utmost respect. I was looking for this theory, and the way you explained it is just great.

swapnilkushwaha

Continue your work. The theoretical concepts are very important; the practical implementations won't take much time.

abhishekkaushik

That was a great video. Hope my understanding continues till the end. Only need to know one thing: you don't have to remember all the things, just know what is going on. That's all. Thanks.

sandipansarkar

You are amazing. Please do not stop making videos.

raminehlopezyazdani

Awesome videos :). I was always confused by the momentum concept in the optimizer; now I understand it crystal clear.

rishabhkumar-qsjb

Continue, sir.

I'm understanding all of this. This is awesome.

Thank you sir for this free educational video; this help means a lot to us...
Keep continuing...
And I'm clicking ads so that you can get money as a reward... 🙏

Artista

Awesome work, dude. Really like your videos. Keep going.

abhishekkaushik

Awesome work, sir! Your sequence of topics is very well organized.

alikalair

SGD with momentum: in the last part at 11:30, it should be V(t+1), because we are computing the upcoming value, and hence V(t) is the most recent known value (see the note after this comment).

adityachandra
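For reference, the indexing convention this comment suggests can be written out as follows; it is the same recurrence as above, with the newly computed velocity given the next index:

V(t+1) = gamma * V(t) + learning_rate * g(t+1)
W(t+1) = W(t) - V(t+1)

Here V(t) is the most recently computed (known) velocity, and V(t+1) is the one being computed for the upcoming step.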

I have been following this playlist, and I don't want to lie, this whole tutorial confused me really badly 😂...

CoderX-mchv

Very well explained. I have not seen any other tutorial with so much emphasis on the foundations. By the way, your video goes out of focus at times; maybe your camera is set to auto-focus.

ranjithmadhavan

At 10:30, why is the learning rate not multiplied by the term \gamma V_t? (See the note after this comment.)

MohandAlbaz
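For reference: in the formulation discussed at 10:30, the learning rate was already applied to each past gradient when it was folded into V, so the term \gamma V_t carries it implicitly. Unrolling the recurrence makes this visible:

V(t) = learning_rate * (g(t) + gamma * g(t-1) + gamma^2 * g(t-2) + ...)

A common alternative formulation (used, for example, by PyTorch's SGD optimizer) accumulates raw gradients and applies the learning rate only at the end:

V(t) = gamma * V(t-1) + g(t)
W(t) = W(t-1) - learning_rate * V(t)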

Shouldn"t the last equation be V(t) instead of V(t-1)

darshmehta