8.3 Bias-Variance Decomposition of the Squared Error (L08: Model Evaluation Part 1)

In this video, we decompose the squared error loss into its bias and variance components.

-------

This video is part of my Introduction to Machine Learning course.

-------

Comments

This was life-saving. Thank you so much, Sebastian, especially for explaining why 2ab = 0 while deriving the decomposition.

elnuisance
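
For readers skimming the thread, here is a sketch of the step this comment refers to, in the video's setting where y is fixed at a given x and the expectation runs over training sets: writing the error as a + b, with a = y - E[y_hat] (a constant) and b = E[y_hat] - y_hat (zero mean), the cross term 2ab vanishes under the expectation.

```latex
% Split the error as a + b, with a = y - E[\hat y] constant and b = E[\hat y] - \hat y zero-mean:
\mathbb{E}\big[(y - \hat y)^2\big]
  = \mathbb{E}\big[\big((y - \mathbb{E}[\hat y]) + (\mathbb{E}[\hat y] - \hat y)\big)^2\big]
  = \underbrace{(y - \mathbb{E}[\hat y])^2}_{\text{Bias}^2}
  + \underbrace{\mathbb{E}\big[(\mathbb{E}[\hat y] - \hat y)^2\big]}_{\text{Variance}}
  + 2\,(y - \mathbb{E}[\hat y])\,\underbrace{\mathbb{E}\big[\mathbb{E}[\hat y] - \hat y\big]}_{=\,\mathbb{E}[\hat y] - \mathbb{E}[\hat y]\,=\,0}
```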

This is how you teach machine learning; respectfully, the prof at my university needs to take notes!

kairiannah

Hi Professor, thank you so much for the excellent explanation!! I learned the bias-variance decomposition a long time ago but never fully understood it until I watched this video! The detailed explanation of each definition helps a lot. Also, the code implementation helps me not only understand the concepts but also apply them in real applications, which is the part I always struggle with! I'll definitely find time to watch the other videos to make my ML foundation more solid.

bluepenguin
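
Since this comment mentions the code implementation: below is a minimal sketch using bias_variance_decomp from mlxtend, the function discussed later in this thread. The diabetes dataset and the decision-tree regressor are illustrative assumptions, not necessarily the video's exact example.

```python
# Illustrative setup; swap in any scikit-learn regressor and dataset.
from mlxtend.evaluate import bias_variance_decomp
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=123)

tree = DecisionTreeRegressor(random_state=123)

# num_rounds bootstrap rounds approximate the expectation over training sets.
avg_loss, avg_bias, avg_var = bias_variance_decomp(
    tree, X_train, y_train, X_test, y_test,
    loss='mse', num_rounds=200, random_seed=123)

print(f'Average expected loss: {avg_loss:.2f}')  # approx. bias^2 + variance
print(f'Average bias^2:        {avg_bias:.2f}')  # avg_bias is squared bias for 'mse'
print(f'Average variance:      {avg_var:.2f}')
```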

Thank you so much for the intuitive explanation! The notation is clear and it just instantly clicked.

stslzop

This was wonderful, Sebastian. After searching, I found no other video on YouTube with such a clear explanation.

whenmathsmeetcoding

The best explanation of bias & variance I've encountered so far.
It would be helpful if you could include the "noise" term too.

PriyanshuSingh-hmtn
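
For reference, the standard extension with a noise term (a textbook result, stated here under the assumption y = f(x) + ε with E[ε] = 0 and Var(ε) = σ²; the video's notation may differ slightly):

```latex
% Assuming y = f(x) + \epsilon with \mathbb{E}[\epsilon] = 0 and \mathrm{Var}(\epsilon) = \sigma^2:
\mathbb{E}\big[(y - \hat y)^2\big]
  = \underbrace{\big(f(x) - \mathbb{E}[\hat y]\big)^2}_{\text{Bias}^2}
  + \underbrace{\mathbb{E}\big[(\hat y - \mathbb{E}[\hat y])^2\big]}_{\text{Variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```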

Thank you so much for the bias-variance videos. Though I intuitively understood it, these equations never made sense to me before I watched the videos. Truly appreciated!!

gurudevilangovan

You should know that you are doing truly good work! Clear down to every single detail.

khuongtranhoang

Thanks for this! It provides one of the best explanations 👏

ashutoshdave

Thank you so much. This helped me understand the bias-variance decomposition mathematically.

krislee

Thank you for this great lecture series!

imvijay

Hi, thanks for teaching, really helpful 😊

siddhesh

I have a couple of questions: Regarding the variance, is this calculated across different parameter estimates given the same functional form of the model? Also, these parameter estimates depend on the optimization algorithm used, right? That is, the fitted models are 'empirically derived' rather than some theoretically optimal parameter combination for a given functional form. If so, would this mean that, technically speaking, there is an additional source of error in the loss calculation, something like 'implementation variance' due to our model likely not having the optimal parameters compared to some theoretical optimum? Hope this makes sense; I'm not a mathematician. Thanks!

Rictoo
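
A hedged sketch of one common answer (not from the video itself): the expectation is taken over training sets D drawn from the data-generating distribution, evaluated at a fixed query point x. If the learning algorithm is itself stochastic (optimizer randomness, initialization), that randomness can be folded into the same expectation, so the 'implementation variance' described above ends up inside the variance term rather than as a separate component.

```latex
% Mean prediction and variance at a fixed x, over random training sets D
% (and, optionally, the algorithm's internal randomness):
\bar{y}(x) = \mathbb{E}_{D}\big[\hat y_{D}(x)\big], \qquad
\mathrm{Var}\big[\hat y(x)\big] = \mathbb{E}_{D}\Big[\big(\hat y_{D}(x) - \bar{y}(x)\big)^{2}\Big]
```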

This is an absolutely brilliant video Sebastian - thank you.

I have no problem deriving the bias-variance decomposition mathematically, but no one seems to explain what the variance or expectation is taken with respect to: is it just one value? Multiple training sets? Different values within one training set? You explained it excellently.

andypandyify

At 10:20, the bias comes out backward because the error should be y_hat - y, not y - y_hat. The "true value" in an error is subtracted from the estimate, not the other way around. This is easily remembered by thinking of a simple random variable with mean mu and error e: y = mu + e. Thus, e = y - mu.

justinmcgrath
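
A brief follow-up observation (mine, not the video's): for the squared loss the direction of the convention is immaterial, since squaring removes the sign; only the unsquared bias changes sign.

```latex
(y - \hat y)^2 = (\hat y - y)^2, \qquad
\mathbb{E}[\hat y] - y = -\big(y - \mathbb{E}[\hat y]\big)
\ \Rightarrow\ \text{Bias}^2 \text{ is unchanged}
```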

When you say bias^2 + variance, is that for a single model?
In the beginning you said bias and variance are computed for different models trained on different datasets; which one is it?
If we consider a single model, then is the bias nothing but the mean error and the variance the mean squared error?

bashamsk
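
A small simulation sketch of the distinction in question (my illustration, under assumed data: a known sine target plus noise): bias and variance are properties of the learning procedure across many training sets, evaluated at a fixed point, not of a single fitted model. A single fitted model contributes just one draw to this distribution, so its individual error cannot be split into bias and variance.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

def f(x):                       # true (normally unknown) target function
    return np.sin(x)

x_query = np.array([[1.5]])     # fixed query point
preds = []

for _ in range(500):            # each round draws a fresh training set
    X = rng.uniform(0, 5, size=(30, 1))
    y = f(X).ravel() + rng.normal(scale=0.3, size=30)
    model = DecisionTreeRegressor(max_depth=2).fit(X, y)
    preds.append(model.predict(x_query)[0])

preds = np.array(preds)
bias_sq = (preds.mean() - f(1.5)) ** 2   # squared bias at x_query
variance = preds.var()                   # variance over training sets

print(f'bias^2 = {bias_sq:.4f}, variance = {variance:.4f}')
```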

Thanks for the great video! One question: at 8:42, why is y constant? Here y = f(x) also has a distribution, i.e., it is a random variable; is that correct? And when you say "apply the expectation on both sides", is this expectation over y or over x?

kevinshao
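
One reading of that step (my gloss, not a reply from the lecturer): the decomposition is carried out pointwise at a fixed query point x, and the video's setup treats the target as deterministic, y = f(x) with no noise term; conditioned on x, y therefore does not vary, and the expectation runs only over the random training set D.

```latex
% Pointwise decomposition at fixed x with deterministic y = f(x);
% the expectation is over training sets D only:
\mathbb{E}_{D}\big[(y - \hat y_{D}(x))^2\big]
  = \big(y - \mathbb{E}_{D}[\hat y_{D}(x)]\big)^2
  + \mathrm{Var}_{D}\big[\hat y_{D}(x)\big]
```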

Professor, does your bias_variance_decomp work in Google Colab? It did not for me. It worked just fine in Jupyter. But the problem with Jupyter is that bagging is way slower (that's my computer) than what I could get in Colab.

DeepS
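
Not an answer from the video, but a common-cause sketch: Colab often preinstalls an older mlxtend, and an outdated version would explain an import or attribute error; upgrading and restarting the runtime is worth trying.

```python
# In a Colab cell (assumption: the preinstalled mlxtend is outdated).
# Restart the runtime afterwards so the upgraded version is imported.
!pip install --upgrade mlxtend

import mlxtend
print(mlxtend.__version__)  # verify a recent version is active
```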

Any good sources or hints on dataset stratification for regression problems?

sefirot
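
One common technique (a sketch under my own assumptions, not a recommendation from the video): discretize the continuous target into quantile bins and stratify the split on those bins.

```python
# Hypothetical data for illustration; replace X, y with your own.
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))                               # placeholder features
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(size=200)   # continuous target

# Bin the target at its quartiles; digitize assigns each sample a bin label 0-3.
y_bins = np.digitize(y, bins=np.quantile(y, [0.25, 0.5, 0.75]))

# Stratify on the bin labels so each split preserves the target's distribution.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y_bins, random_state=123)
```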

I don't understand why you can't multiply 'E' (the expectation) by 'y' (the constant).

jayp
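
For reference, the general rules in play here (the exact step at that point in the video may differ): a multiplicative constant moves freely in and out of an expectation, and the expectation of a constant is the constant itself. Since y is treated as fixed at a given x, it behaves exactly like such a constant.

```latex
% For a constant c (such as y at a fixed x) and random \hat y:
\mathbb{E}[c\,\hat y] = c\,\mathbb{E}[\hat y], \qquad \mathbb{E}[c] = c
```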