Python Tutorial: Probability mass and distribution functions

---

After conducting many random experiments, you will notice that some outcomes are more likely than others. A probability distribution describes how likely each possible outcome is.

There are two important functions that are useful for probability calculations: the probability mass function and the cumulative distribution function.

A discrete random variable has a countable set of possible outcomes, such as the number of heads in a series of coin flips.
The probability mass function allows you to calculate the probability of getting a particular outcome for a discrete random variable.

The binomial probability mass function allows you to calculate the probability of getting k heads from n coin flips with p probability of getting heads.

The formula multiplies the number of different ways that you can get k successes out of n coin flips...

by the probability of success raised to the number of successes, k...

by the probability of failure, 1 - p, raised to the number of failures, n - k.
It's okay if you don't understand the formula right now. With practice, your intuition about this will grow.
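
To make the three pieces of the formula concrete, here is a minimal sketch that multiplies them together by hand using only the standard library. The function name binomial_pmf is just an illustrative choice, not something from scipy.

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    ways = comb(n, k)             # number of different ways to get k successes out of n flips
    success = p ** k              # probability of success raised to the number of successes
    failure = (1 - p) ** (n - k)  # probability of failure raised to the number of failures
    return ways * success * failure

# 2 heads from 10 flips of a fair coin: 45 ways out of 1024 equally likely sequences
prob_two_heads = binomial_pmf(k=2, n=10, p=0.5)
print(round(prob_two_heads, 4))  # 0.0439
```

Summing this function over every k from 0 to n should give 1, which is a quick sanity check that the pieces fit together.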

If we plot the probability mass function of getting k heads out of 10 fair coin flips, you can see that 5 is the most likely outcome.
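
As a rough stand-in for the plot, the sketch below evaluates the probability mass function for every possible number of heads and picks out the most likely one. It assumes numpy and scipy are installed; the variable names are illustrative.

```python
import numpy as np
from scipy.stats import binom

# Every possible number of heads in 10 flips: 0, 1, ..., 10
k_values = np.arange(0, 11)

# Probability of each outcome for a fair coin
pmf_values = binom.pmf(k_values, n=10, p=0.5)

for k, prob in zip(k_values, pmf_values):
    print(f"{k:2d} heads: {prob:.4f}")

# The outcome with the highest probability
most_likely_k = int(k_values[np.argmax(pmf_values)])
print("most likely number of heads:", most_likely_k)
```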

With the scipy.stats library we can use the binom.pmf function to calculate this probability.

If you use binom.pmf with parameters k equals 2, n equals 10, and p equals 0.5, you get the probability of getting 2 heads from 10 flips of a fair coin, which is about 4.4%.

The probability of getting 5 heads from 10 coin flips is almost 25%.

The probability of getting 50 heads out of 100 flips of a biased coin with 30% probability of getting heads is extremely small: roughly 0.001%, nowhere near a 1% chance.

If instead you calculate the probability of getting 65 heads from 100 flips of a biased coin with 70% probability of getting heads, you see that it's almost 5%.
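
The four probabilities just mentioned can be reproduced with a few calls. This is a small sketch assuming scipy is installed; the variable names are only for illustration.

```python
from scipy.stats import binom

# Fair coin, 10 flips
p_2_of_10 = binom.pmf(k=2, n=10, p=0.5)   # exactly 2 heads
p_5_of_10 = binom.pmf(k=5, n=10, p=0.5)   # exactly 5 heads

# Biased coins, 100 flips
p_50_of_100 = binom.pmf(k=50, n=100, p=0.3)  # exactly 50 heads, 30% chance of heads
p_65_of_100 = binom.pmf(k=65, n=100, p=0.7)  # exactly 65 heads, 70% chance of heads

for label, prob in [
    ("2 of 10, p=0.5", p_2_of_10),
    ("5 of 10, p=0.5", p_5_of_10),
    ("50 of 100, p=0.3", p_50_of_100),
    ("65 of 100, p=0.7", p_65_of_100),
]:
    print(f"{label}: {prob:.6f}")
```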

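As n gets larger, the probability of getting any one exact number of heads tends to become smaller for the same p, because the total probability is spread across more possible outcomes.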
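
One way to see this effect is to look at the most likely outcome for a fair coin as the number of flips grows. The sketch below assumes scipy is installed and uses n // 2 as the most likely number of heads, which holds for p equal to 0.5.

```python
from scipy.stats import binom

flip_counts = [10, 100, 1000]

# Probability of the single most likely outcome (half heads) for each n
peak_probs = [binom.pmf(k=n // 2, n=n, p=0.5) for n in flip_counts]

for n, prob in zip(flip_counts, peak_probs):
    print(f"n={n:5d}: P(exactly {n // 2} heads) = {prob:.4f}")
```

Even the most likely single outcome becomes less probable as n increases, which is why ranges of outcomes are often more useful than exact counts.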

If you instead want to calculate the probability of getting k or fewer heads from n flips, you use the binomial cumulative distribution function, which adds the probabilities of getting 0 heads out of n flips, getting heads once out of n flips, and so on all the way up to k heads out of n flips.

The binomial cumulative distribution function allows us to calculate the cumulative probability of getting k heads or fewer from n coin flips with p probability of getting heads.

In Python we use the binom.cdf function with parameters k, n, and p. Adding the probabilities from the mass function for every outcome from 0 up to k, we get the cumulative distribution function (cdf).

This is a way of getting the probability of a range of outcomes rather than the probability of a single outcome.
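
To check that the cumulative distribution function really is a running sum of the mass function, here is a short sketch comparing the two. It assumes numpy and scipy are installed, and the variable names are illustrative.

```python
import numpy as np
from scipy.stats import binom

k, n, p = 5, 10, 0.5

# Add up P(0 heads) + P(1 head) + ... + P(k heads) by hand
summed_pmf = binom.pmf(np.arange(0, k + 1), n=n, p=p).sum()

# Ask for the cumulative probability directly
cdf_value = binom.cdf(k, n=n, p=p)

print(summed_pmf, cdf_value)  # the two values should agree up to rounding error
```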

With the scipy.stats library, we can use the binom.cdf function to get such a probability using the same parameters.

If you use binom.cdf with parameters k equals 5, n equals 10, and p equals 0.5, you get the probability of getting heads 5 times or fewer out of 10 flips, which is about 62%.

The probability of getting heads 50 times or fewer out of 100 flips of a biased coin with 30% probability of getting heads is near 100%. It's almost guaranteed.
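
The two cumulative probabilities above can be computed as follows. This is a sketch assuming scipy is installed; the variable names are illustrative.

```python
from scipy.stats import binom

# Fair coin: 5 heads or fewer out of 10 flips
cdf_5_of_10 = binom.cdf(k=5, n=10, p=0.5)

# Biased coin with 30% chance of heads: 50 heads or fewer out of 100 flips
cdf_50_of_100 = binom.cdf(k=50, n=100, p=0.3)

print(f"P(X <= 5)  with n=10,  p=0.5: {cdf_5_of_10:.4f}")
print(f"P(X <= 50) with n=100, p=0.3: {cdf_50_of_100:.6f}")
```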

The probability of getting heads more than 59 times from 100 flips of a biased coin with p equal to 70% is about 99%. Again, it's almost certain.

What if we want the probability of getting heads more than k times?

This is called the complement, and we get it by subtracting the CDF evaluated at k from 1, because the probability of more than k heads is 1 minus the probability of k or fewer heads.

Alternatively, we can calculate the complement using the function binom.sf with the same parameters. sf stands for survival function, which gives you the upper tail probability, that is, the complement in this case.
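
Both routes to the complement can be compared side by side. The sketch below assumes scipy is installed and uses the 59-heads example with a coin that lands heads 70% of the time; the variable names are illustrative.

```python
from scipy.stats import binom

k, n, p = 59, 100, 0.7

# Complement of the CDF: P(X > k) = 1 - P(X <= k)
more_than_k_via_cdf = 1 - binom.cdf(k=k, n=n, p=p)

# Survival function: returns P(X > k) directly
more_than_k_via_sf = binom.sf(k=k, n=n, p=p)

print(more_than_k_via_cdf, more_than_k_via_sf)  # should agree up to rounding error
```

When the tail probability is very small, binom.sf is generally the safer choice, since subtracting a number very close to 1 from 1 can lose precision.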

We've had some fun calculating probabilities. Now let's practice some more.

#DataCamp #PythonTutorial #FoundationsofProbabilityinPython
Comments

How can I calculate the joint distribution for a dataset that contains six variables, each taking the value 0 or 1? And how can I plot this joint distribution with Python?

fadiabougacha