The Evolution of Gradient Descent

Which optimizer should we use to train our neural network? TensorFlow gives us lots of options, and there are way too many acronyms. We'll go over how the most popular ones work and, in the process, see how gradient descent has evolved over the years.
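The update rules behind the acronyms can be sketched in plain Python. Below is a minimal illustration (not the video's code, and the hyperparameters are assumed values) of three popular optimizers, vanilla SGD, momentum, and Adam, each minimizing the toy function f(x) = x^2:

```python
import math

def grad(x):
    # Gradient of the toy objective f(x) = x**2.
    return 2 * x

def sgd(x, lr=0.1, steps=100):
    # Vanilla gradient descent: step directly against the gradient.
    for _ in range(steps):
        x -= lr * grad(x)
    return x

def momentum(x, lr=0.1, beta=0.9, steps=100):
    # Momentum: accumulate a velocity so past gradients keep pushing.
    v = 0.0
    for _ in range(steps):
        v = beta * v + grad(x)
        x -= lr * v
    return x

def adam(x, lr=0.01, b1=0.9, b2=0.999, eps=1e-8, steps=500):
    # Adam: momentum plus a per-parameter adaptive step size.
    m = v = 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        m = b1 * m + (1 - b1) * g          # first-moment estimate
        v = b2 * v + (1 - b2) * g * g      # second-moment estimate
        m_hat = m / (1 - b1 ** t)          # bias correction
        v_hat = v / (1 - b2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x

for name, opt in [("sgd", sgd), ("momentum", momentum), ("adam", adam)]:
    print(name, opt(2.3))  # all three end up near the minimum at x = 0
```

In TensorFlow these correspond roughly to the built-in optimizer classes; the point of the sketch is just that each successor adds one idea on top of plain gradient descent.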

Code from this video (with coding challenge):

Please subscribe! And like. And comment. That's what keeps me going.

More learning resources:

Join us in the Wizards Slack channel:

Follow me:
Sign up for my newsletter for exciting updates in the field of AI:
Comments

I'm finishing my Master's in Computer Science; my research is in NLP (I use a lot of RNNs), but your videos always give me small insights that help me understand deep learning more "deeply" haha. Just wanted to say that: you're great, Siraj!

omarch

Hi Siraj, I got a chance to watch a few of your videos. I've been an ML researcher for 8 years, but I found your teaching method awesome; anyone could learn from it. Great work!

ananthraj

These videos are definitely getting better.

sz

Thank you so much for this video! I was just about to start researching the differences between the SGD optimization algorithms. Thank you so much for saving me so much time and making a video that has all the pertinent information in a very informative and understandable way. I love your videos so much. Thank you, Siraj, you are my favorite person on the internet. Don't stop what you're doing. You're helping so many people learn so much information that can be sometimes hard to find. Thanks!!!

ryancooper

Your last few videos have been so on point! Very interesting things that are useful for someone who already knows a decent amount of ML and NNs, but not NNs so deeply.

rasen

Aye what an explanation man, big ups, you make an already interesting topic way more interesting. Thanks Siraj!

skatinho

Awesome video, man. Never seen a guy explain something so technical in such an ebullient way!!

stftcalculations

Bro, that was awesome when you said "ohh, gradient descent leads us to convergence!!".

RahulSingh-xjry

You are improving a lot in your presentation style, Siraj! Talking slower and more clearly is really working for your material. Great work👍

TheNiklas

You're the best. I wasn't able to understand these concepts (reading those overly complicated articles), but now it's becoming clear. Thanks a lot. For example, I realized that Adam is the best solver after weeks of grid-search tests, but I didn't know why... and now it's clear.

deniscandido

I can't believe how useful this video is. Rad! Thanks Siraj

martonveto

This video has way fewer views than it should have...
I really hope that more people will find you and your great content!

Isti

Siraj is a robot. His videos keep getting better and better.

Schmuck

Great videos Siraj. Keep up the awesome work

jeremysender

@2:30 Siraj, you should change that graphic of the function y=x^2. The function shown there is not x^2 and could confuse people. You're talking about decreasing the x value in the negative direction of the gradient, from x=2.3 to x=1.4 to x=0.7, i.e. moving from a high x on the right toward smaller x values on the left. Yet the graphic shows movement from the left to the right. Some newbies may be confused by that. But great vid overall.
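The commenter's trajectory is easy to reproduce numerically. A tiny sketch (the learning rate here is an assumed value, not taken from the video) shows the iterates on f(x) = x^2 shrinking from x = 2.3 toward the minimum at x = 0, i.e. moving left on a correctly drawn plot:

```python
def step(x, lr=0.2):
    # One gradient-descent step on f(x) = x**2, whose gradient is 2*x.
    return x - lr * (2 * x)

x = 2.3
trajectory = [x]
for _ in range(3):
    x = step(x)
    trajectory.append(round(x, 3))
print(trajectory)  # each iterate is smaller than the last
```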

RedShipsofSpainAgain

Fantastic video. Keep up the great work Siraj.

jony

Yes, more evolution-of-DL-algorithms videos please. It's really hard to decide which algorithm to use in which situation most of the time! Thanks Siraj for these great videos

suprotikdey

Hi Siraj,

just wanted to compliment everything that you do, but the last three videos in particular; the pacing and the overviews were awesome (usually your videos are a bit too fast for me, and I have to go over and over...) :) And a question: I am working with Keras right now (it is just sooo much easier and more intuitive compared to TF, for which in 5 tutorials I see 5 different coding approaches and TF parameters used for effectively the exact same network) and thought about 2 options for deploying:

1. export model and weights, load them in TF, and do everything according to your video
2. save model and weights, and make a small script in Keras that loads the model and does prediction.

Thoughts?
Thanks!
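For what it's worth, option 2 is usually the simpler route: Keras can serialize the architecture and weights together, and the serving script only needs to reload the file and call predict. A minimal sketch (the model, file name, and input shape are all made up for illustration):

```python
import numpy as np
from tensorflow import keras

# Stand-in for the real trained model.
model = keras.Sequential([
    keras.layers.Input(shape=(4,)),
    keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")

# Save architecture + weights in one file...
model.save("toy_model.keras")

# ...then, in the deployment script, reload and predict.
serving_model = keras.models.load_model("toy_model.keras")
preds = serving_model.predict(np.zeros((2, 4)))
print(preds.shape)  # (2, 1)
```

Option 1 (exporting to raw TensorFlow) mainly pays off when the serving environment can't carry a Keras dependency.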

centar

Holy shit Siraj, the video quality has gotten so amazing. :)

noneofyourbusiness

You're awesome. Thanks for making these videos!! They really help and are entertaining as well.

embiem_