23. Accelerating Gradient Descent (Use Momentum)

MIT 18.065 Matrix Methods in Data Analysis, Signal Processing, and Machine Learning, Spring 2018
Instructor: Gilbert Strang

In this lecture, Professor Strang explains both momentum-based gradient descent and Nesterov's accelerated gradient descent.

License: Creative Commons BY-NC-SA
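
A minimal NumPy sketch for readers who want to experiment alongside the lecture: it compares plain gradient descent with the heavy-ball momentum update on a small ill-conditioned quadratic. The test matrix, step sizes, and momentum coefficient below are illustrative choices based on the standard quadratic analysis, not values quoted from the lecture.

import numpy as np

# Quadratic model f(x) = 1/2 x^T S x with eigenvalues m = 1 and L = 100,
# so the condition number is kappa = L/m = 100 (an illustrative example).
m, L = 1.0, 100.0
S = np.diag([m, L])
grad = lambda x: S @ x

def plain_gd(x0, steps=100):
    """Ordinary gradient descent with the best constant step 2/(L+m) for this model."""
    s = 2.0 / (L + m)
    x = x0.copy()
    for _ in range(steps):
        x = x - s * grad(x)
    return x

def heavy_ball(x0, steps=100):
    """Momentum (heavy-ball) descent: a velocity z remembers earlier gradients."""
    # Standard optimal choices for a quadratic with eigenvalues in [m, L]:
    s = (2.0 / (np.sqrt(L) + np.sqrt(m))) ** 2                            # step size
    beta = ((np.sqrt(L) - np.sqrt(m)) / (np.sqrt(L) + np.sqrt(m))) ** 2   # momentum coefficient
    x, z = x0.copy(), np.zeros_like(x0)
    for _ in range(steps):
        z = beta * z + grad(x)   # accumulate the gradient history
        x = x - s * z            # step along the accumulated direction
    return x

x0 = np.array([1.0, 1.0])
print("plain GD :", np.linalg.norm(plain_gd(x0)))    # shrinks like (kappa-1)/(kappa+1) per step
print("momentum :", np.linalg.norm(heavy_ball(x0)))  # roughly (sqrt(kappa)-1)/(sqrt(kappa)+1) per step

Running it shows the momentum iterates collapsing toward zero far faster than plain gradient descent, which is the improvement in convergence rate the lecture analyzes.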
Comments

Jesus man, I remember back before I started college when I checked out Prof Strang’s calculus series.
He’s aged quite a lot since that series, but he’s still as sharp as a tack. And I’m astonished that, even at his age, he knows so much about machine learning; I didn’t think it was his field.
Huge kudos, Gilbert Strang, huge kudos.

gigik

Such a great lecturer, just as in his classic Linear Algebra lecture series. Really nice to see him up and healthy, sharp and as great a step-by-step explainer as ever.

franzdoe

Professor Strang, thank you for an old-fashioned lecture on Accelerating Gradient Descent.
These topics are very theoretical for the average student.

georgesadler

Why are there no more comments for such a great course? MIT is a great university!

dengdengkenya

I'm so happy to see you here. I only trust you when it comes to lectures.

nguyenbaodung

Wow, this old man is so smart. I would love to see more lectures from him and learn much more of this stuff.

marjavanderwind

He radiates knowledge. Love the content!

honprarules

Those who have the sixth edition of Introduction to Linear Algebra can enjoy this course!!! In my view this course really increases the value of the book.

Arin

I loved this amazing lecture. Great professor, and great content. Thanks for sharing it openly on YouTube.

MsVanessasimoes

Prof Boyd is also a very good teacher!
I enjoy his lectures very much.

何浩源-ry

Finally, a lecture that explains the magic numbers in momentum! Those shorter video formats are great for an introduction but leave me confused about the math behind them. Love the ground-up approach to explaining.

Could anyone tell me what book Professor Strang mentioned at 06:53 of the lecture?

casual_dancer

At 27:00, why follow the direction of the eigenvector? It just comes out of nowhere.

vnpikachu

Such great lecturing makes me wonder how much of MIT students' success is due to innate ability and how much to superior teaching.

vaisuliafu

Crystal clear! Thank you very much for sharing it

antaresd

It’s nice you got it on a linear line.

brendawilliams

Why is it enough to assume x follows an eigenvector to demonstrate the rate of convergence?

Schweini

Tough course to follow, from what I can tell (I'm currently in my 4th semester of undergrad).
Great lecture from Prof. Gilbert Strang. I feel kinda dumb after listening to this lecture, will try again.

newbie

Wow, beautiful, now I see why it oscillates.

meow

Why do we need to make the eigenvector component as small as possible?

vishalpoddar

Can this procedure be expanded to deal with problems in multiple dimensions? So a, b, c, and d are not scalars but actually vectors themselves, representing the inputs x1, x2, x3 to a function f(x1, x2, x3). How would you form R that way, and would you have different condition numbers for each element of b?

alessandromarialaspina