Machine Learning Model Comparison with Bootstrap Resampling | sklearn Implementation

At some point in your machine learning analysis, you want to be able to say that classifier A is better than classifier B and that the difference is statistically significant. In this video I will show you a technique that lets you make such a statement, which makes use of Bootstrap Resampling.
Acknowledgement:
- music from the youtube library
- used seaborn for the violin plot
- thumbnail made with Canva
This is what Wikipedia has to say about Bootstrap Methods:
"Bootstrapping is any test or metric that uses random sampling with replacement, and falls under the broader class of resampling methods. Bootstrapping assigns measures of accuracy (bias, variance, confidence intervals, prediction error, etc.) to sample estimates. This technique allows estimation of the sampling distribution of almost any statistic using random sampling methods.
Bootstrapping estimates the properties of an estimator (such as its variance) by measuring those properties when sampling from an approximating distribution. One standard choice for an approximating distribution is the empirical distribution function of the observed data. In the case where a set of observations can be assumed to be from an independent and identically distributed population, this can be implemented by constructing a number of resamples with replacement, of the observed data set (and of equal size to the observed data set).
It may also be used for constructing hypothesis tests. It is often used as an alternative to statistical inference based on the assumption of a parametric model when that assumption is in doubt, or where parametric inference is impossible or requires complicated formulas for the calculation of standard errors."
The technique can be summarized as follows:
- Sample your data with replacement to create a bootstrap dataset (same length as the original).
- Run your machine learning pipeline on that bootstrap dataset.
- Add the resulting performance metric to your distribution.
- Repeat this N times.
At the end of this process you will have a distribution of your performance metric, which you can then compare against the distribution from another model. If the middle 95% of the distributions from models A and B don't overlap, you can say that the improvement of model B is statistically significant with p < 0.05 for a bootstrap resampling of n = 1000!
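The steps above can be sketched in sklearn. This is a minimal illustration, not the exact code from the video: the dataset, the two models, and n_boot = 200 (kept small for speed; the video uses n = 1000) are all placeholder choices.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

def bootstrap_scores(make_model, n_boot=200, seed=0):
    """Refit a fresh model on n_boot bootstrap resamples of the
    training set; score each fit on the held-out test set."""
    rng = np.random.RandomState(seed)
    n = len(X_train)
    scores = []
    for _ in range(n_boot):
        # sample with replacement, same length as the training set
        idx = rng.randint(0, n, n)
        model = make_model()
        model.fit(X_train[idx], y_train[idx])
        scores.append(accuracy_score(y_test, model.predict(X_test)))
    return np.array(scores)

# placeholder models A and B
scores_a = bootstrap_scores(lambda: DecisionTreeClassifier(random_state=0))
scores_b = bootstrap_scores(
    lambda: make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)))

# middle 95% of each bootstrap distribution
lo_a, hi_a = np.percentile(scores_a, [2.5, 97.5])
lo_b, hi_b = np.percentile(scores_b, [2.5, 97.5])
print(f"model A 95% interval: [{lo_a:.3f}, {hi_a:.3f}]")
print(f"model B 95% interval: [{lo_b:.3f}, {hi_b:.3f}]")
if lo_b > hi_a:
    print("model B's improvement is statistically significant (p < 0.05)")
```

The two score arrays are also exactly what you would hand to seaborn's violin plot to visualize the overlap of the distributions.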
----
----
Follow Me Online Here:
___
Have a great week! 👋