Optimization in Deep Learning | All Major Optimizers Explained in Detail

In this video, we will understand all the major optimizers in Deep Learning. We will see what Optimization in Deep Learning is and why we need it in the first place.

Optimization in Deep Learning is a difficult concept to understand, so after studying it from different sources I have done my best to give you the clearest possible explanation, so that you can understand it with ease.

So I hope that after watching this video, you no longer struggle with the concept and can understand it well.

Optimization in Deep Learning is a technique that speeds up the training of the model.

If you know about mini-batch gradient descent, you will know that the learning takes place in a zig-zag manner. Thus, some time gets wasted moving in a zig-zag direction instead of a straight one.

Optimization in Deep Learning makes the learning path straighter, thus reducing the time taken to train the model.
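
As a rough illustration of that idea, here is a minimal sketch (plain NumPy, a made-up elongated quadratic loss and hand-picked learning rates, not anything taken from the video): momentum keeps an exponentially weighted moving average of past gradients, which damps the zig-zag in the steep direction and lets you use a larger learning rate.

```python
import numpy as np

# Made-up elongated quadratic loss: f(w) = 0.5 * (100*w0^2 + 1*w1^2).
# The steep w0 direction forces plain gradient descent to use a tiny learning
# rate (otherwise it zig-zags wildly or diverges), so progress along the
# shallow w1 direction is slow. Momentum averages past gradients, damping the
# zig-zag and allowing a larger learning rate.

curv = np.array([100.0, 1.0])

def loss(w):
    return 0.5 * np.sum(curv * w**2)

def grad(w):
    return curv * w

def gd(w, lr, steps):
    for _ in range(steps):
        w = w - lr * grad(w)
    return w

def gd_momentum(w, lr, beta, steps):
    v = np.zeros_like(w)
    for _ in range(steps):
        v = beta * v + (1 - beta) * grad(w)  # exponentially weighted moving average of gradients
        w = w - lr * v
    return w

w_init = np.array([1.0, 1.0])
w_gd = gd(w_init, lr=0.019, steps=200)    # close to the largest lr that does not diverge here
w_mom = gd_momentum(w_init, lr=0.3, beta=0.9, steps=200)
print("plain gradient descent loss:", loss(w_gd))
print("GD + momentum loss:         ", loss(w_mom))
```

With the same number of steps, the momentum run ends at a much lower loss, because the averaged gradient lets it take a learning rate that plain gradient descent could not survive.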

➖➖➖➖➖➖➖➖➖➖➖➖➖➖➖

Timestamps:
0:00 Agenda
1:02 Why do we need Optimization in Deep Learning
2:36 What is Optimization in Deep Learning
3:43 Exponentially Weighted Moving Average
9:20 Momentum Optimizer Explained
11:53 RMSprop Optimizer Explained
15:36 Adam Optimizer Explained
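
For reference, here is a minimal sketch of the per-parameter update rules behind the optimizers listed above, all built on the exponentially weighted moving average; the hyperparameter values are common defaults, not necessarily the ones used in the video.

```python
import numpy as np

# Minimal sketch of the update rules (per parameter, plain NumPy).
# beta, beta1, beta2, eps and lr are common default values.

def ewma(prev, x, beta=0.9):
    # Exponentially weighted moving average: the building block of all three optimizers.
    return beta * prev + (1 - beta) * x

def momentum_step(w, grad, v, lr=0.01, beta=0.9):
    v = ewma(v, grad, beta)             # smooth the gradient direction
    return w - lr * v, v

def rmsprop_step(w, grad, s, lr=0.001, beta=0.9, eps=1e-8):
    s = ewma(s, grad**2, beta)          # smooth the squared gradient (per-parameter scale)
    return w - lr * grad / (np.sqrt(s) + eps), s

def adam_step(w, grad, v, s, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    v = ewma(v, grad, beta1)            # Momentum part
    s = ewma(s, grad**2, beta2)         # RMSprop part
    v_hat = v / (1 - beta1**t)          # bias correction for the first few steps (t starts at 1)
    s_hat = s / (1 - beta2**t)
    return w - lr * v_hat / (np.sqrt(s_hat) + eps), v, s
```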

➖➖➖➖➖➖➖➖➖➖➖➖➖➖➖

Comments

Dude, for the longest time I felt like my understanding of moving averages and RMSprop was missing something, and I found it in this video. You have no idea how grateful I am to your channel. Thank you; teachers tend to jump over important concepts without explaining them 🎉

ShuaibGass

Explained it best. After many years finally got it

azharhussian

Your explanations are, as always, clear and very useful. One of the best on YouTube, and I say that as a teacher myself (not in the AI field).

Yet there is one issue that is misexplained in literally all the explanations available on YouTube, and for some reason yours is among them: you forget to mention that the loss surface is unique and different for EVERY observation and may have its minima in different places for different observations. This is extremely important to understand, especially in the context of stochastic gradient descent.

igorg
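
A minimal sketch of that point, using two made-up one-parameter samples: each observation's own loss surface has its minimum at a different place, while the averaged (full-batch) loss has a single minimum in between.

```python
import numpy as np

# Two made-up one-parameter "observations": each sample's squared-error loss
# has its minimum at a different w, so the surface seen by a single stochastic
# step differs from the full-batch surface.

x = np.array([1.0, 1.0])
y = np.array([2.0, 4.0])                 # sample 0 "wants" w = 2, sample 1 "wants" w = 4

def sample_loss(w, i):
    return 0.5 * (w * x[i] - y[i]) ** 2  # minimized at w = y[i] / x[i]

def full_loss(w):
    return np.mean([sample_loss(w, i) for i in range(len(x))])  # minimized at w = 3

for i in range(len(x)):
    print(f"sample {i}: minimum of its own loss surface at w = {y[i] / x[i]}")
print("full-batch loss at w = 3:", full_loss(3.0), "(the single minimum of the averaged surface)")
```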

Best explanation out there. Thanks a lot!

stylish

This is the best explanation so far. Thanks for the great work

ekleanthony

Thank you very much for the awesome explanation.

areegfahad

A life saver! Thank you so much for sharing this with us.

daniapy

You explained it very smoothly and clearly, thank you!

daymatters

You made it super easy. Thanks for sharing

DelightDomain_DB

A small doubt: in RMSprop, since both W and B are affected by the EWMA of the squared gradients, the B direction is also scaled by db, so the steps in the B direction will get checked as well. Won't that also decrease the learning rate and defeat our whole purpose?

shubhamsinghal

Very nice explanation. Keep up the good work.

antonykahuro

I still don't understand one thing: what does B mean here? Is it the direction or the bias?

rohithdasari