Momentum Optimizer in Deep Learning | Explained in Detail

In this video, we will understand in detail what the Momentum Optimizer in Deep Learning is.

The Momentum Optimizer in Deep Learning is a technique that reduces the time taken to train a model.

The path that mini-batch gradient descent takes while learning is zig-zag, not straight, so some time is wasted moving in the zig-zag direction. The Momentum Optimizer smooths out the zig-zag path and makes it much straighter, thus reducing the time taken to train the model.

The Momentum Optimizer uses an Exponentially Weighted Moving Average, which averages out the vertical movement so that the net movement is mostly in the horizontal direction, towards the minimum. Thus the zig-zag path becomes straighter.
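As a rough sketch of the idea (illustrative, not code from the video), here is how an Exponentially Weighted Moving Average smooths a zig-zag sequence; beta = 0.9 is an assumed, commonly used smoothing factor:

def ewma(values, beta=0.9):
    # v_t = beta * v_{t-1} + (1 - beta) * theta_t
    v = 0.0
    smoothed = []
    for theta in values:
        v = beta * v + (1 - beta) * theta
        smoothed.append(v)
    return smoothed

# A zig-zag sequence stays near 0 after smoothing
# instead of jumping between 1 and -1:
print(ewma([1, -1, 1, -1, 1, -1]))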

In this video, we will also understand what the Exponentially Weighted Moving Average is, making this a full, in-depth explanation of the Momentum Optimizer in Deep Learning.
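As a minimal sketch of the update rule itself (assuming the common formulation; the names momentum_step, lr, and beta are illustrative, not from the video):

def momentum_step(w, grad, v, lr=0.01, beta=0.9):
    # Velocity = EWMA of the gradients; step along the smoothed
    # direction instead of the raw, zig-zagging gradient.
    v = beta * v + (1 - beta) * grad
    w = w - lr * v
    return w, v

# Toy usage: minimize f(w) = w**2, whose gradient is 2*w.
w, v = 5.0, 0.0
for _ in range(500):
    w, v = momentum_step(w, 2 * w, v)
print(w)  # approaches the minimum at w = 0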

➖➖➖➖➖➖➖➖➖➖➖➖➖➖➖

Timestamps:
0:00 Agenda
1:00 Why do we need Momentum?
2:53 Exponentially Weighted Moving Average
8:29 Momentum in Mini-Batch Gradient Descent
9:50 Why does Momentum work?

➖➖➖➖➖➖➖➖➖➖➖➖➖➖➖

Comments

This is EXACTLY HOW I needed to learn: Maths + Visualization with equations! Thank you so much!

anuranroy

Very few of them have even explained what momentum is made up of and what its equation is. You took just 2 minutes to add that explanation, but it helped so much in understanding the remaining 10 minutes of the video without pausing. Great work. Please keep it up.

pranaysingh

Very few resources on the internet explain these concepts with this kind of depth and clarity. Either they are in-depth but not understandable, or clear but not in-depth. Loved your explanation.

bijoyroy

Your lectures are very short and easy to understand. I hope you will make more videos like this about optimization algorithms in deep learning. Thank you, very useful video.

minhnhat

Very nice explanation, thank you. Mathematics from scratch is what I was looking for. This really helped!!

chinmaysoni

This was exactly what I was looking for. Thanks a lot.

amirrezasadeghi

Greatly explained! Thank you!! (I find it even better than Andrew's one on momentum.) Keep it up!!

redalamphd

At 5:27, when computing V3, aren't you missing the factor (1-beta) from V2?

olgaptacek

This is the best explanation. Thank you

ghilesdjebara

You're a great man, dude! Thanks a lot.

pranaysingh

At 7:00, I think the formula is supposed to be V(t) = Beta*Theta(t) + (1-Beta)*V(t-1) rather than V(t) = Beta*V(t-1) + (1-Beta)*Theta(t). Am I seeing that correctly?

careyshane
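For the two formula questions above, the EWMA recurrence as conventionally written (e.g. in Andrew Ng's course) is

$$v_t = \beta\, v_{t-1} + (1 - \beta)\,\theta_t$$

which, unrolled for three points (with $v_0 = 0$), gives

$$v_3 = (1 - \beta)\,\theta_3 + \beta(1 - \beta)\,\theta_2 + \beta^2(1 - \beta)\,\theta_1$$

so every earlier term does carry a $(1 - \beta)$ factor, discounted by an extra power of $\beta$.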

Oh my god, this was clearly explained. Thanks for this perfect insight.

melikakeshavarz

At 3:09, you say we give higher weightage to new points and lower weightage to old points, but at 7:47 you seem to say the opposite, so there is some confusion here. I would appreciate it if you could resolve this.

Ankit-hsnb
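On the weightage question above: unrolling the same recurrence shows that the weight on the $k$-th most recent point is $(1 - \beta)\beta^k$. With an illustrative $\beta = 0.9$, the weights are 0.1, 0.09, 0.081, ... going back in time, so newer points always receive the higher weight and older points are discounted by powers of $\beta$.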

Wonderful video. Made the concept look very easy...

aashwinsharma

Very informative video, brother. Thank you very much for the explanation; it was great.

mohamedmohudoom

What would be the difference between this and Adadelta?

mamahuhu_one

Thank you for a detailed video! I'm not an expert in this area. Could you explain what W and B are? From my understanding, W is the vector of parameters in the cost function, e.g., we want to minimize f(W). Is that correct? If so, what is B? How is it different from W? Thanks!

OnTastySpots
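On the W and B question above: in the usual neural-network notation (an assumption about this video's usage), W is the weight matrix and B the bias of a layer; both are trained parameters, and momentum keeps a separate velocity for each. A minimal sketch:

import numpy as np

# Hypothetical linear layer y = W @ x + b; both W and b are learned.
W, b = np.zeros((1, 3)), np.zeros(1)
vW, vb = np.zeros_like(W), np.zeros_like(b)
beta, lr = 0.9, 0.01

def momentum_update(grad_W, grad_b):
    global W, b, vW, vb
    vW = beta * vW + (1 - beta) * grad_W  # EWMA of dLoss/dW
    vb = beta * vb + (1 - beta) * grad_b  # EWMA of dLoss/db
    W = W - lr * vW
    b = b - lr * vb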

You are so good, but some of your videos have no subtitles ("captions unavailable"). Please enable them for all your videos. Thanks a lot.

zshahlaie

Very well explained, sir. Can you please start a DSA playlist for Python?

anirbanrana

I want to contact you for business work.

alidakhil