Who's Adam and What's He Optimizing? | Deep Dive into Optimizers for Machine Learning!

Welcome to our deep dive into the world of optimizers! In this video, we'll explore the crucial role that optimizers play in machine learning and deep learning. From Stochastic Gradient Descent to Adam, we cover the most popular algorithms, how they work, and when to use them.

🔍 What You'll Learn:

Basics of Optimization - Understand the fundamentals of how optimizers work to minimize loss functions

Gradient Descent Explained - Dive deep into the most foundational optimizer and its variants like SGD, Momentum, and Nesterov Accelerated Gradient

Advanced Optimizers - Get to grips with Adam, RMSprop, and AdaGrad, learning how they differ and their advantages

Intuitive Math - Unveil the equations for each optimizer and learn how each stands out from the others (a code sketch of these update rules follows this list)

Real-World Benchmarks - See experiments from published papers, in domains ranging from computer vision to reinforcement learning, showing how these optimizers fare against each other
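
For reference, here is a minimal NumPy sketch of the update rules the video walks through (SGD, momentum, RMSprop, AdaGrad, Adam). The hyperparameter defaults (lr, beta, eps) are common illustrative values and are assumptions, not necessarily the settings used in the video.

    # Minimal sketch of the optimizer update rules; hyperparameters are illustrative.
    import numpy as np

    def sgd(w, grad, lr=0.01):
        # Vanilla (stochastic) gradient descent: step straight down the gradient.
        return w - lr * grad

    def sgd_momentum(w, grad, v, lr=0.01, beta=0.9):
        # Momentum: keep a running velocity so past gradients smooth the step.
        v = beta * v + grad
        return w - lr * v, v

    def rmsprop(w, grad, s, lr=0.001, beta=0.9, eps=1e-8):
        # RMSprop: scale each parameter's step by a running RMS of its gradients.
        s = beta * s + (1 - beta) * grad**2
        return w - lr * grad / (np.sqrt(s) + eps), s

    def adagrad(w, grad, s, lr=0.01, eps=1e-8):
        # AdaGrad: accumulate all squared gradients, so step sizes only ever shrink.
        s = s + grad**2
        return w - lr * grad / (np.sqrt(s) + eps), s

    def adam(w, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
        # Adam: momentum (first moment) plus RMSprop-style scaling (second moment),
        # with bias correction for the zero-initialized averages; t starts at 1.
        m = beta1 * m + (1 - beta1) * grad
        v = beta2 * v + (1 - beta2) * grad**2
        m_hat = m / (1 - beta1**t)
        v_hat = v / (1 - beta2**t)
        return w - lr * m_hat / (np.sqrt(v_hat) + eps), m, v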

🔗 Extra Resources:

📌 Timestamps:
0:00 - Introduction
1:17 - Review of Gradient Descent
5:37 - SGD w/ Momentum
9:26 - Nesterov Accelerated Gradient
10:55 - Root Mean Squared Propagation
13:59 - Adaptive Gradients (AdaGrad)
14:47 - Adam
18:12 - Benchmarks
22:01 - Final Thoughts

Stay tuned and happy learning!
Comments

The Adam optimizer is a very complex topic, and you introduced and explained it very well in a surprisingly short video! I'm impressed, Sourish! Definitely one of my favorite videos from you!

akshaynaik

I remember when my teacher gave me an assignment on optimizers: I went through blogs, papers, and videos, but everywhere I looked I saw different formulas and was so confused. You explained everything in one place very clearly.

AbhishekVerma-kjhd

Absolutely loved the graphics and the intensive, paper-based walkthrough of how the different optimizers work, all in the same video. You just earned a loyal viewer.

theardentone

I don't comment on videos a lot... but I just wanted to let you know this is the best visualization and explanation of optimizers I've found on YouTube. Great job.

amaniworks

Love the simplified explanation and animation! Videos with this quality and educational value would be worth millions of likes and subscribers on other channels... this is so underrated.

razever

Nice animations and nice explanations of the math behind them. I was curious about how different optimizers work but didn't want to spend an hour going through documentation, and this video answered most of my questions!

One question that remains is about the AdamW optimizer. I read that it is practically just a better version of Adam, but I didn't really find any intuitive explanation of how it affects training (ideally with graphics like these, haha). There are not many videos on YouTube about it.

TEGEKEN
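
Since the comment above asks about AdamW: its one change relative to Adam is "decoupled" weight decay, applied directly to the weights rather than added to the gradient as an L2 penalty that would then pass through Adam's adaptive scaling. A rough sketch follows; hyperparameter names and values are illustrative assumptions.

    # Sketch of an AdamW step; the difference from Adam is in the final update line.
    import numpy as np

    def adamw(w, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
              eps=1e-8, weight_decay=0.01):
        # First- and second-moment estimates, exactly as in Adam.
        m = beta1 * m + (1 - beta1) * grad
        v = beta2 * v + (1 - beta2) * grad**2
        m_hat = m / (1 - beta1**t)
        v_hat = v / (1 - beta2**t)
        # Weight decay is applied to w directly ("decoupled"), outside the
        # adaptive scaling, instead of being folded into grad as an L2 term.
        w = w - lr * (m_hat / (np.sqrt(v_hat) + eps) + weight_decay * w)
        return w, m, v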

Didn't expect to have to learn three optimizers in order to understand Adam, but here we are. It took me so much time to go through this video, and I had to have ChatGPT explain those formulas a bit more in depth before it slowly started to make sense. But I think I've got the (math) intuition behind it now. Thanks for this video; lots of others skip the math, but you cannot really understand it without the math, because ML IS math, right?! Btw, the visualizations were pretty great!

rigelr

Sir, your exposition is excellent: the presentation, the cadence, the simplicity.

OpsAeterna

This video is amazing!
You covered one of the most important topics in ML, with all the major optimization algorithms. I literally had no idea about Momentum, NAG, RMSprop, AdaGrad, or Adam.

Now I have a good overview of them all and will dive deeper into each.

Thanks for the video! ❤

Param

Very clear explanation! Thank you. I especially appreciate the fact that you included the equations.

aadilzikre

Thank you for such an easy, simple, and great explanation. I searched for a quick overview of how Adam works and found your video. I am actually training a DRL REINFORCE policy gradient algorithm, with the theta parameters being the weights and biases of a CNN, which is exactly where Adam is involved. Thanks again, very informative.

AndBar

Just found your channel. Instant follow 🙏🏼 Hope we can see more computer science content like this. Thank you ;)

ai_outline

Very nicely explained. I wish you had brought up the relationship between these optimizers and numerical methods, though, like how vanilla gradient descent is just Euler's method applied to a gradient rather than a single derivative.

jeremiahvandagrift
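
On the point above: gradient descent can be read as forward Euler applied to the gradient-flow ODE dtheta/dt = -grad L(theta), with the learning rate playing the role of the Euler step size. A tiny check on a quadratic loss (the function and step size are arbitrary illustrative choices):

    # Gradient descent vs. forward Euler on dtheta/dt = -grad L(theta).
    import numpy as np

    def grad_L(theta):
        return 2.0 * theta                 # gradient of L(theta) = theta**2

    theta_gd = np.array([1.0])
    theta_euler = np.array([1.0])
    h = 0.1                                # learning rate == Euler step size
    for _ in range(5):
        theta_gd = theta_gd - h * grad_L(theta_gd)               # GD update
        theta_euler = theta_euler + h * (-grad_L(theta_euler))   # Euler step
    assert np.allclose(theta_gd, theta_euler)  # the two trajectories coincide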

Nice vid. I'd mention MAS too, to explicitly say that Adam is weaker at the start and can get stuck in local minima (until it accumulates enough data), while SGD performs well early thanks to its stochasticity but is slower later, so combining both methods helps (they performed roughly this way in the MAS paper).

orellavie

Wow! Great video, more of these deep dives into basic components of ML please

wutvr

Thanks for the great explanations! The graphics and benchmark were particularly useful.

MD-zddu

I used to have networks where the loss fluctuated in a very periodic manner every 30 or so steps, and I never knew why that happened. Now it makes sense! It just takes a number of steps for the direction of Adam's weight updates to change.
I really should have looked this up earlier.

Alice_Fumo
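
The lag described above is easy to see in isolation: Adam's update direction follows its first-moment moving average, which needs several steps to change sign after the gradient flips. A toy illustration (beta1 = 0.9 is a common default, assumed here):

    # Adam's first-moment EMA lags a gradient that flips sign at step 11;
    # the update direction only reverses a few steps later.
    beta1 = 0.9
    m = 0.0
    for step in range(1, 21):
        grad = 1.0 if step <= 10 else -1.0   # gradient flips sign at step 11
        m = beta1 * m + (1 - beta1) * grad   # first-moment moving average
        print(step, grad, round(m, 3))       # m stays positive until ~step 15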

Nice video :) I appreciate the visual examples of the various optimizers.

MalTramp

You are incredibly intelligent to explain such a complex topic, built from dozens of research papers' worth of knowledge, in a single 20-minute video... what the heck!

nark

Excellent video, please keep it up! Subscribed and will share with my colleagues too :)

alexraymond