Machine Learning Model Comparison with Bootstrap Resampling | sklearn Implementation

At some point in your machine learning analysis, you want to be able to say that classifier A is better than classifier B and that this difference is statistically significant. In this video I will show you a technique that lets you make such a statement, which makes use of Bootstrap Resampling.

Acknowledgement:
- music from the youtube library
- used seaborn for the violin plot
- thumbnail made with Canva

This is what Wikipedia has to say about Bootstrap Methods:
"Bootstrapping is any test or metric that uses random sampling with replacement, and falls under the broader class of resampling methods. Bootstrapping assigns measures of accuracy (bias, variance, confidence intervals, prediction error, etc.) to sample estimates. This technique allows estimation of the sampling distribution of almost any statistic using random sampling methods.

Bootstrapping estimates the properties of an estimator (such as its variance) by measuring those properties when sampling from an approximating distribution. One standard choice for an approximating distribution is the empirical distribution function of the observed data. In the case where a set of observations can be assumed to be from an independent and identically distributed population, this can be implemented by constructing a number of resamples with replacement, of the observed data set (and of equal size to the observed data set).

It may also be used for constructing hypothesis tests. It is often used as an alternative to statistical inference based on the assumption of a parametric model when that assumption is in doubt, or where parametric inference is impossible or requires complicated formulas for the calculation of standard errors. "

The technique can be summarized as follows:
- Sample your data with replacement to create a bootstrap dataset (of the same length as the original).
- Run your machine learning pipeline on that bootstrap dataset.
- Add the performance metric to your distribution.
- Repeat this N times.

At the end of this process you will have a distribution of your performance metric, which you can then compare against the distribution from another model. If the middle 95% of the distributions from models A and B don't overlap, you can say that the improvement of model B is statistically significant with p < 0.05 for a bootstrap resampling of n = 1000!
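The steps above can be sketched in scikit-learn. This is a minimal illustration, not the code from the video: the dataset, the two classifiers, the `bootstrap_scores` helper, and the use of n = 100 resamples (instead of 1000, for speed) are all assumptions made here for the example.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

def bootstrap_scores(model, X, y, n_boot=100, seed=0):
    """Accuracy distribution from fitting `model` on bootstrap resamples."""
    rng = np.random.default_rng(seed)
    n = len(y)
    scores = []
    for _ in range(n_boot):
        # Sample row indices with replacement, same length as the data
        idx = rng.integers(0, n, size=n)
        # Run the pipeline on the bootstrap dataset
        X_tr, X_te, y_tr, y_te = train_test_split(
            X[idx], y[idx], test_size=0.3, random_state=0)
        model.fit(X_tr, y_tr)
        # Add the performance metric to the distribution
        scores.append(accuracy_score(y_te, model.predict(X_te)))
    return np.array(scores)

model_a = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model_b = RandomForestClassifier(n_estimators=50, random_state=0)

scores_a = bootstrap_scores(model_a, X, y)
scores_b = bootstrap_scores(model_b, X, y)

# Middle 95% of each performance distribution
lo_a, hi_a = np.percentile(scores_a, [2.5, 97.5])
lo_b, hi_b = np.percentile(scores_b, [2.5, 97.5])
print(f"Model A 95% interval: [{lo_a:.3f}, {hi_a:.3f}]")
print(f"Model B 95% interval: [{lo_b:.3f}, {hi_b:.3f}]")
```

If the two printed intervals do not overlap, the difference between the models is significant in the sense described above; the two score arrays can also be passed directly to seaborn's violinplot to visualize the distributions.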

Have a great week! 👋
Comments

Hi, thank you for sharing! Do you mind sharing the code file also? Thanks.

lightsflashing

Hi, can you share an example on Fine and Gray modeling? Also, where can I get the code? Thanks!

haoduong

Nice explanation, thank you!
I have a question: this is basically a model selection technique, so in which way is it better (or worse) than other techniques? Why would I want to use this over something like, say, cross-validation?

m-pana

Since you are doing model selection, why not use the Akaike Information Criterion?

nadarasarbahavan