An Introduction to the Hypergeometric Distribution
An introduction to the hypergeometric distribution. I briefly discuss the difference between sampling with replacement and sampling without replacement. I describe the conditions required for the hypergeometric distribution to hold, discuss the formula, and work through 2 simple examples.
I also discuss the relationship between the binomial distribution and the hypergeometric distribution, and a rough guideline for when the binomial distribution can be used as a reasonable approximation to the hypergeometric. I finish with a brief example involving the multivariate hypergeometric distribution.
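As a reference point, the hypergeometric probability formula mentioned above can be written as follows. The symbols are labels chosen here for illustration and may not match the video's notation: N is the population size, a is the number of successes in the population, n is the sample size, and x is the number of successes in the sample.

```latex
P(X = x) = \frac{\binom{a}{x}\,\binom{N-a}{n-x}}{\binom{N}{n}},
\qquad \max\bigl(0,\; n-(N-a)\bigr) \le x \le \min(n,\; a)
```

In the first example below, N = 20, a = 6, n = 5, and x = 4. In R's argument order this corresponds to dhyper(x, a, N - a, n).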
For those using R, here is the R code to find the probabilities for the examples in this video:
The probability of picking exactly 4 red balls when picking 5 balls from a source containing 6 red and 14 yellow.
Without replacement (hypergeometric):
choose(6,4)*choose(14,1)/choose(20,5)
[1] 0.01354489
or
dhyper(4,6,14,5)
[1] 0.01354489
With replacement (binomial):
dbinom(4,5,6/20)
[1] 0.02835
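The difference between the two sampling schemes can also be checked by simulation. The following is a minimal sketch, not taken from the video: the seed, the number of repetitions, and the helper name draw_four_red are arbitrary choices made here for illustration.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

# Urn from the example: 6 red and 14 yellow balls
urn  <- c(rep("red", 6), rep("yellow", 14))
reps <- 100000

# Draw 5 balls and report whether exactly 4 of them are red
# (hypothetical helper, named here for illustration)
draw_four_red <- function(replace) {
  sum(sample(urn, size = 5, replace = replace) == "red") == 4
}

# Proportion of simulated draws with exactly 4 red balls
sim_without <- mean(replicate(reps, draw_four_red(FALSE)))
sim_with    <- mean(replicate(reps, draw_four_red(TRUE)))

c(without_replacement = sim_without, with_replacement = sim_with)
```

With this many repetitions the two estimates should land close to the exact dhyper and dbinom values above, though they will vary slightly with the seed.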
The probability of picking exactly 7 females when randomly sampling 10 students from a school with 1100 female and 900 male students.
Without replacement (hypergeometric):
choose(1100,7)*choose(900,3)/choose(2000,10)
[1] 0.1664901
or
dhyper(7,1100,900,10)
[1] 0.1664901
With replacement (binomial):
dbinom(7,10,1100/2000)
[1] 0.1664783
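The two results above are nearly identical because the sample of 10 is a tiny fraction of the 2000 students. The sketch below illustrates how the binomial approximation behaves as the sample fraction changes. It keeps the 55% female proportion and the sample size of 10 from the example, but the other school sizes are made up for illustration. A commonly quoted rule of thumb is that the approximation is reasonable when the sample is no more than roughly 5% of the population; that threshold is an assumption here, since the description does not state the video's exact guideline.

```r
# Keep the 55% female proportion and the sample of 10, but vary the school size
pop_sizes <- c(20, 200, 2000, 20000)  # hypothetical sizes, except 2000
n_sample  <- 10
x         <- 0:n_sample

# Largest gap between the hypergeometric and binomial pmfs over all x
max_abs_difference <- sapply(pop_sizes, function(N) {
  females <- round(0.55 * N)
  hyper   <- dhyper(x, females, N - females, n_sample)
  binom   <- dbinom(x, n_sample, females / N)
  max(abs(hyper - binom))
})

comparison <- data.frame(
  population         = pop_sizes,
  sample_fraction    = n_sample / pop_sizes,
  max_abs_difference = max_abs_difference
)
comparison
```

The gap shrinks as the sample fraction falls, which is why the binomial result matches the hypergeometric one so closely for the school of 2000.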
Multivariate hypergeometric: the probability of picking exactly 3 Democrats, 2 Republicans, and 1 independent in a sample of 6 drawn from a group of 12 Democrats, 24 Republicans, and 8 independents.
choose(12,3)*choose(24,2)*choose(8,1)/choose(44,6)
[1] 0.06881377
or, with the extraDistr package installed and loaded:
library(extraDistr)
dmvhyper(c(3,2,1),c(12,24,8),6)
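The same multivariate probability can also be computed in base R without an extra package. The helper below is a hypothetical sketch written for illustration (the name mv_hyper_prob is not from base R or extraDistr). It generalizes the choose() product used above to any number of groups.

```r
# Hypothetical helper (not part of base R or extraDistr): multivariate
# hypergeometric probability of drawing counts x from groups of sizes n
mv_hyper_prob <- function(x, n) {
  stopifnot(length(x) == length(n), all(x >= 0), all(x <= n))
  # Ways to pick x[i] from each group, divided by ways to pick the whole sample
  prod(choose(n, x)) / choose(sum(n), sum(x))
}

# 3 Democrats, 2 Republicans, 1 independent from groups of 12, 24, and 8
p <- mv_hyper_prob(x = c(3, 2, 1), n = c(12, 24, 8))
p
```

The sample size does not need to be passed separately, because it is the sum of the requested counts.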