Unbiased Estimators (Why n-1 ???) : Data Science Basics

Finally answering why we divide by n-1 in the sample variance!
Comments

In order to be even more practical, I would simply say that:
- Mean: You only need 1 value to estimate it. (The mean is the value itself.)

- Variance: You need at least 2 values to estimate it. The variance estimates the spread between values (the more variance, the more spread out around the mean the data are). It is impossible to get this spread from only one value.

For me this is enough to explain practically why it is n for the mean and n-1 for the variance.

davidszmul
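
This point can be checked directly with Python's standard library: the mean is defined for a single observation, but the sample variance, which divides by n-1, needs at least two. A minimal sketch (the sample values here are made up):

```python
# The mean works with one data point; the sample variance does not,
# because dividing by n-1 = 0 is undefined.
import statistics

one_point = [4.2]
print(statistics.mean(one_point))        # 4.2: one value pins down the mean

try:
    statistics.variance(one_point)       # divides by n-1 = 0
except statistics.StatisticsError as err:
    print("variance undefined:", err)    # requires at least two data points

two_points = [4.2, 6.8]
print(statistics.variance(two_points))   # ~3.38: two values make spread measurable
```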

I believe this is the best channel I have discovered in a long time. Thanks man.

ashi

Best explanation I've seen on YouTube. Excellent!

Physicsnerd

Great video, now I understand why I failed that test years ago 😅

YusufRaul

How I think about it: suppose you have n data points: x1, x2, x3, x4, ..., xn. We don't really know the population mean, so let's just pick the data point on our list which is closest to the sample mean, and use it to approximate the population mean. Say this is xi.

We can then code the data by subtracting xi from each element, which doesn't affect any measure of spread (including the variance). After coding we will have a list x1', x2', ..., xn', but the i'th position will be 0. Then only the other n-1 data points contribute to the spread around the mean, so we should take the average of those n-1 squared deviations.

jamiewalker
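
Whichever intuition one prefers, the claim itself is easy to verify by simulation: averaged over many samples, the divide-by-n estimator undershoots the population variance by exactly the factor (n-1)/n, while the divide-by-(n-1) estimator does not. A small sketch, assuming a normal population with variance 4 and samples of size 5 (all choices arbitrary):

```python
# Compare the divide-by-n and divide-by-(n-1) variance estimators
# across many simulated samples from a known population.
import numpy as np

rng = np.random.default_rng(0)
sigma2, n, reps = 4.0, 5, 200_000

samples = rng.normal(loc=10.0, scale=np.sqrt(sigma2), size=(reps, n))
biased = samples.var(axis=1, ddof=0)     # divide by n
unbiased = samples.var(axis=1, ddof=1)   # divide by n-1 (Bessel's correction)

print(biased.mean())     # ~3.2 = sigma2 * (n-1)/n, biased low
print(unbiased.mean())   # ~4.0 = sigma2, unbiased
```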

I wish you would cover all the math related to ML and data science.

abderrahmaneisntthatenough

I am reading a book on Jim Simons, who ran the Medallion fund. I’ve gone down the rabbit hole of Markov chains and this is an excellent tutorial. Thank you.

Matthew-ezze

I've been trying to understand this for weeks, and this video cleared it all up. THANK YOU :))

cadence_is_a_penguin

The lucidity of this explanation is commendable.

stelun

Thanks!! I love the way of saying "boost the variance."

junechu

What about n-2 or n-p? How come the more parameters we estimate, the more we adjust? How exactly does that translate into the calculation, and what is the logic behind it?

vvalkvvalk
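
One concrete place the n-p pattern shows up is regression: the residual variance estimate divides by n-p because each fitted coefficient uses up one degree of freedom, just as the sample mean uses up one in the n-1 case. A sketch for simple linear regression, where p = 2 (intercept and slope); the data here are simulated, not from the video:

```python
# Residual variance in simple linear regression: divide by n - p with p = 2,
# since two parameters (intercept, slope) were estimated from the data.
import numpy as np

rng = np.random.default_rng(1)
n = 50
x = rng.uniform(0, 10, n)
y = 2.0 + 0.5 * x + rng.normal(0, 1.5, n)      # true error variance = 2.25

slope, intercept = np.polyfit(x, y, 1)          # fit the 2 parameters
residuals = y - (intercept + slope * x)

sigma2_hat = (residuals ** 2).sum() / (n - 2)   # n - p, not n
print(sigma2_hat)                               # ~2.25 on average across datasets
```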

I watch all your vids in my free time. Thanks for sharing!

DistortedV

That last blue equation looks more straightforward to me as:

= [n/(n-1)] [σ² - σ²/n]
= σ² [n/(n-1)] [1 - 1/n]
= σ² [n/(n-1)] [(n-1)/n]
= σ²

... but that's entirely my problem. :D

Anyway, great video, well done, many thanks!

PS - On the job we used to say that σ² came from the whole population, n, but s² comes from n-1 because we lost a degree of freedom when we sampled it. Not accurate but a good way to socialize the explanation.

Ni
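
For reference, here is the standard derivation that the algebra above finishes off, reconstructed from textbook identities (E[X_i²] = σ² + μ² and E[X̄²] = σ²/n + μ²) rather than copied from the video:

```latex
\begin{align*}
E\!\left[\sum_{i=1}^{n}(X_i-\bar{X})^2\right]
  &= E\!\left[\sum_{i=1}^{n}X_i^2 - n\bar{X}^2\right]
   = n(\sigma^2+\mu^2) - n\!\left(\frac{\sigma^2}{n}+\mu^2\right)
   = (n-1)\,\sigma^2, \\
E[s^2]
  &= \frac{1}{\,n-1\,}\,E\!\left[\sum_{i=1}^{n}(X_i-\bar{X})^2\right]
   = \sigma^2 .
\end{align*}
```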

Thanks for the great explanation! But one question: why minus 1? Why not 2? I know the DoF concept comes in here, but all the explanations I have gone through fix the value of the mean so as to make the last sample not independent. In reality, as we take samples the mean is not fixed! It is itself dependent on the values of the samples, so the DoF would be the number of samples itself!

kvs
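
One way to make the lost degree of freedom concrete: whatever the sample is, the deviations from the sample mean always satisfy one linear constraint (they sum to zero), so only n-1 of them can vary freely; the last one is forced. A minimal sketch with random data:

```python
# Deviations from the sample mean always sum to (numerically) zero,
# so the last deviation is determined by the other n-1.
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(size=8)
deviations = x - x.mean()

print(deviations.sum())         # ~0.0, up to floating-point error
print(-deviations[:-1].sum())   # equals the last deviation...
print(deviations[-1])           # ...which is therefore not free
```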

Please do one lesson on the concept of ESTIMATORS. It would be good to understand the basics of estimators before getting into the concept of being BIASED or not. Anyway, you are doing extremely well and your way of explaining is simply superb. Clap, clap!

ChakravarthyDSK

I wanted to ask a question. For E(x bar), x bar is calculated using a sample of size n, so is E(x bar) the average value of x bar over all samples of size n? Other than that, I think this has been one of the more informative videos on this topic.

Additionally, many times people tie the concept of degrees of freedom into this, but usually they show why you have n-1 degrees of freedom and then just say "that's why we divide by n-1". I understand why there are n-1 degrees of freedom, but not how that justifies dividing by n-1. I was wondering if you had any input on this?

braineater
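
On the first question: yes, E(x bar) is the average of x bar over the distribution of all samples of size n, and a simulation can approximate it by averaging over many simulated samples. A minimal sketch (normal population with mean 7; all values arbitrary):

```python
# Approximate E(x bar) by averaging the sample mean over many
# simulated samples of size n; it comes out close to mu.
import numpy as np

rng = np.random.default_rng(2)
mu, n, reps = 7.0, 10, 100_000

sample_means = rng.normal(mu, 3.0, size=(reps, n)).mean(axis=1)
print(sample_means.mean())   # ~7.0: x bar is an unbiased estimator of mu
```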

Is that because we lose 1 degree of freedom when we use the estimated mean to calculate the estimated variance?

yitongchen

Try explaining the above ideas using degrees of freedom.

DonLeKouT

Thank you. Could you please do a clip on expected value, its rules, and how to derive some results?

Set_Get

Well explained, very clear and easy to understand.

richardchabu