22. Gradient Descent: Downhill to a Minimum

MIT 18.065 Matrix Methods in Data Analysis, Signal Processing, and Machine Learning, Spring 2018
Instructor: Gilbert Strang

Gradient descent is the most common optimization algorithm in deep learning and machine learning. It uses only the first derivative (the gradient) when updating the parameters: a stepwise process that moves downhill to reach a local minimum.
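
As a concrete illustration of the update rule described above, here is a minimal gradient-descent sketch in Python on the quadratic f(x, y) = (x^2 + b*y^2)/2 that appears later in the lecture; the step size, iteration count, and starting point below are illustrative choices, not values taken from the lecture.

```python
import numpy as np

def grad_f(v, b):
    """Gradient of f(x, y) = 0.5 * (x**2 + b * y**2)."""
    x, y = v
    return np.array([x, b * y])

def gradient_descent(v0, b, step=0.1, iters=200):
    """Plain gradient descent: v_{k+1} = v_k - step * grad_f(v_k)."""
    v = np.array(v0, dtype=float)
    for _ in range(iters):
        v = v - step * grad_f(v, b)
    return v

# With b = 0.1 and starting point (b, 1), the iterates head toward the minimum
# at (0, 0), slowly in the y-direction when b is small (the point of the example).
print(gradient_descent([0.1, 1.0], b=0.1))
```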

License: Creative Commons BY-NC-SA
Comments

Professor Strang, thank you for a straightforward lecture on Gradient Descent: Downhill to a Minimum and its relationship with convex functions. The examples are important for a deep understanding of this topic in numerical linear algebra.

georgesadler

Very clear and natural lesson to follow. Thank you so much, Professor Strang. By the way, his books are also wonderful.

tungohoang

Very natural way of teaching. Thank you, sir.

Musabbir_Sakib

Wow, the video quality is awesome, and Professor Gilbert Strang's lecture is the best.

satyamwarghat

Your lectures are a pleasure to watch (and learn from)!

mkelly

If calc 1 introduced the term argmin for the place where the minimum occurs, there would be less confusion; students often mistake the argmin for the actual min.

TheRsmits
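
A one-line illustration of the distinction raised above, with a made-up function:

```latex
f(x) = (x-3)^2 + 1:\qquad
\min_x f(x) = 1 \ \text{(a value of } f\text{)},\qquad
\arg\min_x f(x) = 3 \ \text{(a point in the domain)}.
```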

His picture with grad(f) pointing up around 9:00 is a bit misleading, I think. grad(f) is a vector in the x-y plane, pointing in the direction you should move in the x-y plane to increase f fastest.

martinspage
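
A small example of the point above, using a generic bowl rather than the function on the board:

```latex
f(x,y) = x^2 + y^2,\qquad \nabla f(x,y) = (2x,\ 2y),\qquad \nabla f(1,0) = (2,\ 0).
```

That gradient lies in the x-y plane and points away from the minimum at the origin; nothing about it points "up" out of the plane. The surface rises fastest above that direction.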

Omg look at how clean those top boards are 🤩

naterojas

Why, at 42:25, isn't the gradient [x, by], since f is multiplied by 1/2?

에헤헿-lv
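
Assuming the function on the board is f(x, y) = (x^2 + b*y^2)/2, which is Strang's usual example, the 1/2 is exactly what cancels the 2 from the power rule; without the 1/2 the gradient picks up the factor of 2:

```latex
f = \tfrac12\left(x^2 + b y^2\right) \;\Rightarrow\; \nabla f = \begin{bmatrix} x \\ b y \end{bmatrix},
\qquad
f = x^2 + b y^2 \;\Rightarrow\; \nabla f = \begin{bmatrix} 2x \\ 2 b y \end{bmatrix}.
```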

What beautiful functions. That's why I love linear algebra.

kirinkirin

Absolutely well done and definitely keep it up!!! 👍👍👍👍👍👍

brainstormingsharing

How did he get the equations for xk, yk, and fk at 45:35? Specifically, how did he get (b-1)/(b+1) and vice versa? I rearranged the update to make xk+1 and yk+1 the subjects of the equations, but instead I got xk+1 = xk(1 - 2sk), where xk = x0 = b.

samuelyeo
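
One way to sanity-check the formulas asked about above, assuming the lecture's f(x, y) = (x^2 + b*y^2)/2 with starting point (x0, y0) = (b, 1) and exact line search: for a quadratic (1/2) z'Sz, the exact step along the gradient g is s = (g'g)/(g'Sg), so the iterates can be compared numerically with x_k = b((b-1)/(b+1))^k and y_k = (-(b-1)/(b+1))^k. A sketch:

```python
import numpy as np

b = 0.1                       # small b makes the valley narrow and convergence slow
S = np.diag([1.0, b])         # f(z) = 0.5 * z @ S @ z with z = (x, y)
z = np.array([b, 1.0])        # starting point (b, 1)

r = (b - 1) / (b + 1)         # the ratio in the closed-form iterates
for k in range(6):
    closed_form = np.array([b * r**k, (-r)**k])
    print(k, z, closed_form)              # the two columns should agree
    g = S @ z                             # gradient of 0.5 * z'Sz
    s = (g @ g) / (g @ S @ g)             # exact line-search step size
    z = z - s * g
```

For this particular f and starting point, the exact step works out to the constant 2/(1+b), so x_{k+1} = x_k(1 - s) = x_k(b-1)/(b+1) and y_{k+1} = y_k(1 - sb) = -y_k(b-1)/(b+1); that is where the ratio comes from.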

Around 40:27: does anybody know how to derive the reduction rate involving m/M (the condition number)? Any tips or references?

RC.
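
Regarding the question above: for the quadratic model f(x) = (1/2) x'Sx with the eigenvalues of S between m and M, one standard bound for steepest descent with exact line search comes from the Kantorovich inequality,

```latex
f(x_{k+1}) \;\le\; \left(\frac{M-m}{M+m}\right)^{2} f(x_k),
```

and (M-m)/(M+m) = (1-b)/(1+b) when m = b and M = 1, which matches the ratio in the lecture's example. Strang's textbook Linear Algebra and Learning from Data works this example out in the gradient-descent section, and Luenberger and Ye's Linear and Nonlinear Programming gives the Kantorovich-inequality derivation.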

I think grad(f) at 16:00 should be 0.5(S + S transpose)x - a, right? Anyway, thank you for the amazing lecture!

HieuLe-unll
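
For reference, the general identity behind the comment above; when S is symmetric (the usual assumption in the lecture), the two forms agree:

```latex
f(x) = \tfrac12\, x^{\mathsf T} S x - a^{\mathsf T} x
\;\Rightarrow\;
\nabla f(x) = \tfrac12\left(S + S^{\mathsf T}\right) x - a
\;=\; S x - a \quad \text{when } S = S^{\mathsf T}.
```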

Isn't grad(f) supposed to be [x, by] instead of [2x, 2by]?

gopalkulkarni

I was hoping for a discussion of derivative conventions. Much of the material I've seen makes the gradient a row vector, which leads to derivatives that are the transposes of what he shows. In his example, the derivative of a'x is a, which is contrary to the intuition from single-variable calculus, though he does use that intuition for x'Sx.

finweman
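
For concreteness, here are the two conventions the comment above contrasts, written side by side (without assuming S is symmetric):

```latex
\text{Gradient as a column (as in the lecture):}\quad
\frac{\partial\,(a^{\mathsf T} x)}{\partial x} = a,\qquad
\frac{\partial\,(x^{\mathsf T} S x)}{\partial x} = \left(S + S^{\mathsf T}\right) x;

\text{derivative as a row (numerator layout):}\quad
\frac{\partial\,(a^{\mathsf T} x)}{\partial x} = a^{\mathsf T},\qquad
\frac{\partial\,(x^{\mathsf T} S x)}{\partial x} = x^{\mathsf T}\left(S + S^{\mathsf T}\right).
```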

For those who have not watched the previous lectures, this lecture won't help much.

TheNeutralGuy

At 26:51, the professor writes gradient(f) = entries of X^-1. Does anyone know how to get that equation? Thanks!

shenzheng
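
Assuming the function at 26:51 is f(X) = log det X, the standard example for this identity, one short derivation goes through the differential of the determinant:

```latex
d\,(\log\det X) = \operatorname{tr}\!\left(X^{-1}\, dX\right)
\;\Rightarrow\;
\frac{\partial\,(\log\det X)}{\partial X_{ij}} = \left(X^{-1}\right)_{ji},
```

so the entries of the gradient are the entries of X^{-1} (transposed; they coincide when X is symmetric).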

He is too long-winded. Why not use a simple function of x and y, find the derivatives, and start doing a few iterations? Finally he gets to gradient descent. Gradient descent works, but there are better algorithms. The line-search idea is a good start. WTF is wrong with this guy? A simple Python program or even Excel would be much more meaningful. Thumbs down.

pnachtwey

This sounds like a bunch of nonsense.

John-wxzn