Ridge Regression (L2 Regularization)

Linear regression is a powerful statistical tool for data analysis and machine learning. But when your hypothesis (model) uses a higher-order polynomial, the model can overfit the data. One way to avoid such overfitting is Ridge Regression, or L2 regularization. It effectively adds a term to the cost function that limits the model's parameter values. This is also sometimes referred to as shrinkage.
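As a concrete illustration of the idea above, here is a minimal NumPy sketch of ridge regression in closed form; the names ridge_fit, X, y, and lam are illustrative assumptions, not code from the video:

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge regression: minimizes
    ||X @ theta - y||^2 + lam * ||theta||^2,
    giving theta = (X^T X + lam * I)^(-1) X^T y."""
    n_params = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_params), X.T @ y)

# Example: fit a high-order polynomial to a few noisy points.
rng = np.random.default_rng(0)
x = np.linspace(0, 1, 10)
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(10)
X = np.vander(x, N=9, increasing=True)  # columns: 1, x, x^2, ..., x^8
theta = ridge_fit(X, y, lam=0.1)        # larger lam -> smaller coefficients
```

With lam = 0 this reduces to ordinary least squares; increasing lam shrinks the fitted coefficients.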

** Linear Regression with gradient descent (Ordinary Least Squares) video:

Comments

The sign of a great teacher is the ability to make complicated concepts simple to the student. You, my friend, are a great teacher. Thank you!

jehushaphat

that inverse writing is f...ng awesome

yonko

I watched many videos about ridge regression, and this is the best one I have seen. The majority of videos just talk about working with a few parameters and doing a linear fit. You go beyond that and discuss how to generalize ridge regression. This video is the best.

prasama

So far, after struggling for days, I think you have made it almost clear to me how regularization can reduce the effect of theta (or, we could say, the slope). I checked most of the videos about regularization and, to be honest, none helped me understand the regularization term and how it really affects the slope/steepness. You used the Normal Equation to elaborate the idea of regularization, which was magnificent for getting a clear view of how you can decrease the steepness of theta by varying lambda. The larger lambda is, the less steep theta becomes, and vice versa.

Unfortunately, most videos/sources don't elaborate on the intuition behind this term and how it really changes the thetas/slopes. They all say the same thing about penalizing/reducing the steepness without showing why and how.

hussameldinrabah
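To spell out the shrinkage effect described in the comment above, here is the regularized normal equation in the usual notation; a sketch assuming a design matrix X, targets y, parameters theta, and penalty lambda as in the video:

```latex
% Ridge regression (regularized normal equation)
\hat{\theta} = \left(X^{\top}X + \lambda I\right)^{-1} X^{\top} y

% Single-feature case (no intercept), showing the shrinkage directly:
\hat{\theta} = \frac{\sum_i x_i y_i}{\sum_i x_i^{2} + \lambda}
```

Increasing lambda only enlarges the denominator, so the fitted slope is pulled toward zero; with lambda = 0 the ordinary least squares solution is recovered.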

If you don't do any normalization, a reasonable choice for theta can easily be much larger than 1. Since with least squares we have a convex error surface, you don't need to normalize. However, I agree that in general normalizing your data doesn't hurt, and in that case your suggestion of picking a value between 0 and 1 makes a lot of sense! Kudos for the nice explanation and derivation!

ahans

Great explanation. Followed along just fine after reading ISLR ridge section. Helped me see the approach of RR behind the code and text.

nathancruz

Awesome... I was totally confused about ridge regression as I am new to data science. Thanks a lot for your help.

MrAbhishek

Madone: This was brilliant. It's going straight from your video into MATLAB. I'm beginning to understand the maths of the reservoir computing model echo location I'm writing!
I got this equation for ridge regression from Tholer's PhD: Wout = (Tm'*M)*((M'*M) + B*eye(N))^-1; and there at 8:51 its derivation is explained. Thanks.

themadone
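For anyone following along, here is a rough NumPy equivalent of the MATLAB line quoted above, assuming M holds the collected states (one row per sample), Tm the corresponding targets, and B the regularization strength (names taken from the comment, not from the video):

```python
import numpy as np

def ridge_readout(M, Tm, B):
    """Ridge-regularized least-squares readout, mirroring
    Wout = (Tm'*M) * ((M'*M) + B*eye(N))^-1 from the comment."""
    N = M.shape[1]                      # number of state dimensions
    A = M.T @ M + B * np.eye(N)         # regularized Gram matrix
    return Tm.T @ M @ np.linalg.inv(A)  # in practice, prefer np.linalg.solve
```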

Very well explained. Your channel should have a lot more views.

julianocamargo

That's the kind of video I was looking for. There are a lot of videos with obvious information and nothing about the mathematical representation and derivatives. You did it very well.
What about the constant, theta_{0}? A lot of sources say that theta_{0} shouldn't be regularized, and that in the equation, instead of the identity matrix, we should use a modified identity matrix with the first row full of zeros.

adrianbrodowicz
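Regarding the theta_{0} question above: a minimal sketch of that variant, assuming X already has a leading column of ones for the intercept (the function name and variables here are illustrative, not from the video):

```python
import numpy as np

def ridge_fit_unpenalized_intercept(X, y, lam):
    """Closed-form ridge where theta_0 (the intercept) is not regularized:
    the identity matrix gets its first diagonal entry zeroed out."""
    P = np.eye(X.shape[1])
    P[0, 0] = 0.0                                     # leave the intercept unpenalized
    return np.linalg.solve(X.T @ X + lam * P, X.T @ y)
```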

Thanks for the wonderful explanation. Could you please make the same kind of video for lasso and elastic net?

rohitkamble

Great explanation, but why does lambda have to be multiplied by the identity matrix?

bastiano

How can I apply this to a small artificial dataset? Do you have any examples of that?

CarpoMedia

It would be superb if you could do the same from scratch in Python, i.e., formulating the matrices X and Y, optimizing the cost function (finding the minimum), and arriving at theta.

pranjaldureja

In the end I was like "wait what are all those formulas???"

WestcoastJourney

Thank you for your explanation, it was wonderful. I have a question: how can I use ridge regression in MATLAB? And if I have my input and output, how do I use them in the ridge regression code, and what will the coefficients in ridge regression be? Please help me, I can't figure it out.

fatmahm

at the beginning, why do you put a bar on top of x?

EW-mbih

What I didn't understand is: lambda will only be on the diagonal, so how does that help? In X^T X + lambda*(identity matrix), why only the diagonal elements and not all of them?

yatinarora

The final formula is not correct. You should not get the identity matrix $I$ in the formula.

anarbay