Handling Imbalanced Datasets in Python with Stratified Split, SMOTE and Random Oversampling

preview_player
Показать описание
In this video, we discuss handling imbalanced datasets in a classification context by using a number of different sampling techniques in python.

We begin by using a stratified split technique to ensure the training and test sets have an equal proportion of samples from each class. We then move on to the business of handling imbalanced datasets by employing the SMOTE technique, which oversamples the minority class by creating synthetic observations and Random Oversampling which oversamples instances from the minority class. SMOTE and Random Oversampling both rely on the imbalanced learn library (imblearn).

Рекомендации по теме
Комментарии
Автор

good video. my doubt is cleared regarding stratified and smote technique . Confusion about which one to use before. Thanks.

VanithaSRA