Stanford CS229: Machine Learning | Summer 2019 | Lecture 8 - Kernel Methods & Support Vector Machine


Anand Avati
Computer Science, PhD

To follow along with the course schedule and syllabus, visit:
Comments

Thanks for making such a detailed lecture available online, Dr. Avati!

anirudhthatipelli

Lecture 8 completed: understood the general principle of kernelization and got a great understanding of SVMs. Prof. Anand Avati really explains these concepts with clarity.

DevanshChaudhary-duuz

At 59:34, the lecturer says that the reason kernel methods do not need a theta vector at prediction time is that "we give up the phi(x) representation."

I see what he's trying to say, but here's (what I believe to be) a clearer explanation:

We could, theoretically, first calculate theta by summing up beta_i * phi(x_i) over all n training examples, and then (at prediction time) multiply that theta by the phi(x) vector for our test example x. However, each of those two steps would require multiplying by a massive (potentially infinite-dimensional) phi vector. (In fact, n times just for calculating theta.)

The trick, then, is to end training with our parameter still expressed in terms of the phi(x_i), knowing that the parameter will later (at prediction time) be multiplied by the phi(x) of our test example x -- a multiplication we can perform very cheaply, because we can kernelize the two massive feature vectors with each other. This is only possible if we keep the parameter expressed in terms of the phi(x_i) from training (i.e., if we store the x_i and their coefficients beta_i).

That leaves us with no explicit massive vector multiplications, instead of many of them (one per training example during training and one per test example at prediction time).

waelq
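
To illustrate the comment above, here is a minimal sketch of kernelized prediction. The function names and the choice of a Gaussian (RBF) kernel are assumptions for illustration, not taken from the lecture:

import numpy as np

def rbf_kernel(x, z, gamma=1.0):
    # Computes k(x, z) = phi(x) . phi(z) for an (implicitly) very
    # high-dimensional phi, without ever forming phi explicitly.
    return np.exp(-gamma * np.sum((np.asarray(x) - np.asarray(z)) ** 2))

def predict(x_test, X_train, beta, kernel=rbf_kernel):
    # theta = sum_i beta_i * phi(x_i) is never materialized; instead
    # theta . phi(x_test) = sum_i beta_i * k(x_i, x_test).
    return sum(b * kernel(x_i, x_test) for b, x_i in zip(beta, X_train))

Usage would look like predict(x_new, X_train, beta), where beta and the training examples X_train are whatever was stored at training time.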

If I understand correctly, we are no longer estimating parameters like theta; instead we are storing the values of beta and all the training examples. Since this setup would have a tendency to overfit, how would we regularize it? Can we apply ridge/lasso on beta instead of theta?

AdaGradschool
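
One common answer, stated here as a sketch rather than as the lecture's own approach: the usual L2 penalty on theta carries over to beta, because with theta = sum_i beta_i * phi(x_i) we get ||theta||^2 = beta^T K beta, and the regularized (kernel ridge) solution is beta = (K + lambda*I)^{-1} y. A minimal sketch with an assumed RBF kernel and hypothetical names:

import numpy as np

def rbf_kernel(x, z, gamma=1.0):
    return np.exp(-gamma * np.sum((np.asarray(x) - np.asarray(z)) ** 2))

def fit_kernel_ridge(X_train, y, lam=0.1):
    # Gram matrix K[i, j] = k(x_i, x_j) over the training set.
    n = len(X_train)
    K = np.array([[rbf_kernel(X_train[i], X_train[j]) for j in range(n)]
                  for i in range(n)])
    # The L2 penalty on theta equals beta^T K beta, and the regularized
    # (kernel ridge) solution is beta = (K + lam * I)^{-1} y.
    return np.linalg.solve(K + lam * np.eye(n), np.asarray(y, dtype=float))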

At 56:41 it should have been beta_i instead of beta_j.

durgeshmishra-fnkx

Why did you take x1, x2, x3, ... at 15:32? The input vector is just in terms of x, right?

sravanthkurmala
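
For what it may clarify: x is a single input vector and x1, x2, x3 are its components; the feature map phi is built from those components. A small sketch (the degree-2 monomial feature map here is an assumption for illustration, not necessarily the one used at 15:32):

import numpy as np

def phi(x):
    # x = (x1, x2, x3, ...) is one input vector; phi(x) stacks the raw
    # components together with all degree-2 monomials x_i * x_j.
    x = np.asarray(x, dtype=float)
    quadratic = [x[i] * x[j] for i in range(len(x)) for j in range(len(x))]
    return np.concatenate([x, quadratic])

print(phi([1.0, 2.0, 3.0]))  # 3 linear terms followed by 9 quadratic terms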