Unbiased Estimators (Why n-1 ???) : Data Science Basics

Finally answering why we divide by n-1 in the sample variance!
Comments

In order to be even more practical, I would simply say that:
- Mean: You only need 1 value to estimate it. (The mean is the value itself.)

- Variance: You need at least 2 values to estimate it. The variance estimates the spread between values (the more variance, the more spread out around the mean the data are). It is impossible to get this spread from only one value.

For me this is enough to explain practically why it is n for the mean and n-1 for the variance.

davidszmul
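
This point can be checked directly with Python's standard library: the mean is defined for a single observation, but the sample variance, which divides by n-1, needs at least two. A minimal sketch (the sample values here are made up):

```python
# The mean works with one data point; the sample variance does not,
# because dividing by n-1 = 0 is undefined.
import statistics

one_point = [4.2]
print(statistics.mean(one_point))        # 4.2: one value pins down the mean

try:
    statistics.variance(one_point)       # divides by n-1 = 0
except statistics.StatisticsError as err:
    print("variance undefined:", err)    # requires at least two data points

two_points = [4.2, 6.8]
print(statistics.variance(two_points))   # ~3.38: two values make spread measurable
```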

I believe this is the best channel I have discovered in a long time. Thanks man.

ashi

Best explanation I've seen on YouTube. Excellent!

Physicsnerd

Great video, now I understand why I failed that test years ago 😅

YusufRaul

How I think about it: suppose you have n data points: x1, x2, x3, x4, ..., xn. We don't really know the population mean, so let's just pick the data point on our list which is closest to the sample mean, and use it to approximate the population mean. Say this is xi.

We can then code the data by subtracting xi from each element, which doesn't affect any measure of spread (including the variance). After coding we will have a list x1', x2', ..., xn', but the i'th position will be 0. Then only the other n-1 data points contribute to the spread around the mean, so we should take the average of those n-1 squared deviations.

jamiewalker
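
Whichever intuition one prefers, the claim itself is easy to verify by simulation: averaged over many samples, the divide-by-n estimator undershoots the population variance by exactly the factor (n-1)/n, while the divide-by-(n-1) estimator does not. A small sketch, assuming a normal population with variance 4 and samples of size 5 (all choices arbitrary):

```python
# Compare the divide-by-n and divide-by-(n-1) variance estimators
# across many simulated samples from a known population.
import numpy as np

rng = np.random.default_rng(0)
sigma2, n, reps = 4.0, 5, 200_000

samples = rng.normal(loc=10.0, scale=np.sqrt(sigma2), size=(reps, n))
biased = samples.var(axis=1, ddof=0)     # divide by n
unbiased = samples.var(axis=1, ddof=1)   # divide by n-1 (Bessel's correction)

print(biased.mean())     # ~3.2 = sigma2 * (n-1)/n, biased low
print(unbiased.mean())   # ~4.0 = sigma2, unbiased
```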

I wish you would cover all the math related to ML and data science.

abderrahmaneisntthatenough

I am reading a book on Jim Simons, who ran the Medallion fund. I’ve gone down the rabbit hole of Markov chains and this is an excellent tutorial. Thank you.

Matthew-ezze

I've been trying to understand this for weeks, and this video cleared it all up. THANK YOU :))

cadence_is_a_penguin

The lucidity of this explanation is commendable.

stelun

Thanks!! I love the way of saying "boost the variance."

junechu

What about n-2 or n-p? How come the more parameters we estimate, the more we adjust? How exactly does that translate into the calculation, and what is the logic behind it?

vvalkvvalk
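
One concrete place the n-p pattern shows up is regression: the residual variance estimate divides by n-p because each fitted coefficient uses up one degree of freedom, just as the sample mean uses up one in the n-1 case. A sketch for simple linear regression, where p = 2 (intercept and slope); the data here are simulated, not from the video:

```python
# Residual variance in simple linear regression: divide by n - p with p = 2,
# since two parameters (intercept, slope) were estimated from the data.
import numpy as np

rng = np.random.default_rng(1)
n = 50
x = rng.uniform(0, 10, n)
y = 2.0 + 0.5 * x + rng.normal(0, 1.5, n)      # true error variance = 2.25

slope, intercept = np.polyfit(x, y, 1)          # fit the 2 parameters
residuals = y - (intercept + slope * x)

sigma2_hat = (residuals ** 2).sum() / (n - 2)   # n - p, not n
print(sigma2_hat)                               # ~2.25 on average across datasets
```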

I watch all your vids in my free time. Thanks for sharing!

DistortedV

That last blue equation looks more straightforward to me as:

= [n/(n-1)] [σ² - σ²/n]
= σ² [n/(n-1)] [1 - 1/n]
= σ² [n/(n-1)] [(n-1)/n]
= σ²

... but that's entirely my problem. :D

Anyway, great video, well done, many thanks!

PS - On the job we used to say that σ² came from the whole population, n, but s² comes from n-1 because we lost a degree of freedom when we sampled it. Not accurate but a good way to socialize the explanation.

Ni
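
For reference, here is the standard derivation that the algebra above finishes off, reconstructed from textbook identities (E[X_i²] = σ² + μ² and E[X̄²] = σ²/n + μ²) rather than copied from the video:

```latex
\begin{align*}
E\!\left[\sum_{i=1}^{n}(X_i-\bar{X})^2\right]
  &= E\!\left[\sum_{i=1}^{n}X_i^2 - n\bar{X}^2\right]
   = n(\sigma^2+\mu^2) - n\!\left(\frac{\sigma^2}{n}+\mu^2\right)
   = (n-1)\,\sigma^2, \\
E[s^2]
  &= \frac{1}{\,n-1\,}\,E\!\left[\sum_{i=1}^{n}(X_i-\bar{X})^2\right]
   = \sigma^2 .
\end{align*}
```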

Thanks for the great explanation! But one question: why minus 1? Why not 2? I know the DoF concept comes in here, but all the explanations I have gone through fix the value of the mean so as to make the last sample not independent. In reality, as we take samples the mean is not fixed! It is itself dependent on the values of the samples, so the DoF would be the number of samples itself!

kvs
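
One way to make the lost degree of freedom concrete: whatever the sample is, the deviations from the sample mean always satisfy one linear constraint (they sum to zero), so only n-1 of them can vary freely; the last one is forced. A minimal sketch with random data:

```python
# Deviations from the sample mean always sum to (numerically) zero,
# so the last deviation is determined by the other n-1.
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(size=8)
deviations = x - x.mean()

print(deviations.sum())         # ~0.0, up to floating-point error
print(-deviations[:-1].sum())   # equals the last deviation...
print(deviations[-1])           # ...which is therefore not free
```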

Please do one lesson on the concept of ESTIMATORS. It would be good to understand the basics of estimators before getting into the concept of being BIASED or not. Anyway, you are doing extremely well and your way of explaining is simply superb. Clap, clap!

ChakravarthyDSK

I wanted to ask a question. For E(x bar), x bar is calculated using a sample of size n, so is E(x bar) the average value of x bar over all samples of size n? Other than that, I think this has been one of the more informative videos on this topic.

Additionally, many times people tie the concept of degrees of freedom into this, but usually they show why you have n-1 degrees of freedom and then just say "that's why we divide by n-1". I understand why there are n-1 degrees of freedom, but not how that justifies dividing by n-1. I was wondering if you had any input on this?

braineater
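
On the first question: yes, E(x bar) is the average of x bar over the distribution of all samples of size n, and a simulation can approximate it by averaging over many simulated samples. A minimal sketch (normal population with mean 7; all values arbitrary):

```python
# Approximate E(x bar) by averaging the sample mean over many
# simulated samples of size n; it comes out close to mu.
import numpy as np

rng = np.random.default_rng(2)
mu, n, reps = 7.0, 10, 100_000

sample_means = rng.normal(mu, 3.0, size=(reps, n)).mean(axis=1)
print(sample_means.mean())   # ~7.0: x bar is an unbiased estimator of mu
```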

Is that because we lose 1 degree of freedom when we use the estimated mean to calculate the estimated variance?

yitongchen

Try explaining the above ideas using degrees of freedom.

DonLeKouT

Thank you. Could you please do a clip on expected value, its rules, and how to derive some results?

Set_Get

Well explained, very clear and easy to understand.

richardchabu