Preprocessing data for Machine Learning - Deep Dive

Logistic Regression - Preprocessing Cheat Sheet! How should we preprocess data for logistic regression? Please Subscribe!

SPONSOR
Kite is a free AI-powered coding assistant that will help you code faster and smarter. The Kite plugin integrates with all the top editors and IDEs to give you smart completions and documentation while you’re typing. I've been using Kite. Love it! Learn more here:

TIMESTAMPS
0:00 Introduction
1:34 Snape – Artificial Data Generator
2:56 Effects of Standardization
5:32 Effects of Encoding
8:45 Effects of Data Imbalance
10:50 Effects of Correlation
11:40 Variance Inflation Factor Explained
13:30 Dealing with Multicollinearity
16:53 Effects of Missing Data
18:09 Summary

#machinelearning #logisticregression #artificialintelligence #AI
Comments

You had me up until you used OLS to understand how each feature impacted the logistic regression's output. Logistic regression models ≠ OLS models, and you can't use OLS to determine which features are statistically significant in a logistic regression model.

You can fit a logistic regression model with statsmodels by using the GLM class and setting the family argument to Binomial. Then you can fit the model and get the summary output just like you did for OLS, and use it to determine which features are statistically significant.

Otherwise good info!

shnibbydwhale

One interesting alternative to missing data imputation would be: train a model using the other features to predict the missing one.
Thx for the video!

Raulvic
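
Raulvic's suggestion above can be sketched roughly as follows: fit a regressor on the rows where the feature is observed, then predict it for the rows where it is missing. The data, the column layout, and the choice of LinearRegression are assumptions for illustration only.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data (an assumption): column c depends linearly on columns a and b.
rng = np.random.default_rng(0)
n = 200
a = rng.normal(size=n)
b = rng.normal(size=n)
c = 3.0 * a - 2.0 * b + rng.normal(scale=0.1, size=n)
X = np.column_stack([a, b, c])

# Knock out roughly 20% of column c to simulate missing values.
missing = rng.random(n) < 0.2
true_c = X[missing, 2].copy()
X[missing, 2] = np.nan

# Train on rows where c is observed, using the other features as predictors.
observed = ~np.isnan(X[:, 2])
imputer_model = LinearRegression().fit(X[observed, :2], X[observed, 2])

# Fill the gaps with the model's predictions.
X_imputed = X.copy()
X_imputed[~observed, 2] = imputer_model.predict(X[~observed, :2])
```

scikit-learn's experimental IterativeImputer generalizes this idea to several incomplete columns at once; the hand-rolled version above is only meant to make the mechanism visible.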

Thank you so so much! You helped me a lot with my bachelor thesis!

theyseemerollintheyhatin

Hey This Video is pretty SICK!! 😅 Awesome work man! Smashed the like button.

NicholasRenotte

Wow, great video. I learnt more about multicollinearity, thanks!

ramonsantiago

Great content! I’m curious why you use DataFrameMapper to map columns to transformers instead of ColumnTransformer. It seems like pipeline steps in the latest version of sklearn work ok with dataframes as input. Is it to also be able to see the transformer output as a dataframe?

hansenmarc
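
As a point of comparison for hansenmarc's question above, this is a small sketch of the ColumnTransformer route, fed a pandas DataFrame directly inside a Pipeline. The toy DataFrame, its column names, and the choice of transformers are hypothetical; the video's actual DataFrameMapper setup is not reproduced here.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Toy data (hypothetical columns, for illustration only).
df = pd.DataFrame({
    "age": [22, 35, 58, 41, 29, 63],
    "income": [28_000, 52_000, 91_000, 67_000, 39_000, 105_000],
    "city": ["A", "B", "A", "C", "B", "C"],
})
y = [0, 0, 1, 1, 0, 1]

# Map columns to transformers by name: scale the numeric columns, one-hot encode the categorical one.
preprocess = ColumnTransformer([
    ("num", StandardScaler(), ["age", "income"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["city"]),
])

# The DataFrame goes straight into the pipeline; no manual conversion to arrays is needed.
pipeline = Pipeline([
    ("preprocess", preprocess),
    ("clf", LogisticRegression()),
])
pipeline.fit(df, y)

# Inspect the preprocessed matrix: 2 scaled columns plus 3 one-hot columns.
transformed = preprocess.fit_transform(df)
```

By default the transformed output is an array, not a DataFrame, which may be one reason to prefer DataFrameMapper. Recent scikit-learn releases (1.2 and later) add set_output(transform="pandas") for getting DataFrames back, though whether that covers the video's use case is not something the source confirms.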

Amazing content bro. Could you please update your channel playlists bro. Thanks

teetanrobotics