Feature Engineering-How to Transform Data to Better Fit The Gaussian Distribution-Data Science

Some machine learning models, like linear and logistic regression, assume that the variables are normally distributed. Others benefit from "Gaussian-like" distributions because, in such distributions, the observations of X available to predict Y vary across a greater range of values. Thus, Gaussian-distributed variables may improve the performance of a machine learning algorithm.
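
As a rough illustration of the idea in this description, the sketch below applies a few common transformations to a right-skewed sample and compares the skewness of each result. The synthetic log-normal data, the variable names, and the particular set of transformations are assumptions made for this example; they are not taken from the video's notebook.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Synthetic right-skewed, strictly positive sample (an assumption, not the video's data).
x = rng.lognormal(mean=2.0, sigma=0.8, size=2000)

# Candidate transformations often tried to make a variable more Gaussian-like.
candidates = {
    "original": x,
    "log": np.log(x),
    "square_root": np.sqrt(x),
    "reciprocal": 1.0 / x,
}

# Box-Cox estimates its own power parameter (lambda); it requires positive input.
boxcox_x, boxcox_lambda = stats.boxcox(x)
candidates["box_cox"] = boxcox_x

# Skewness near 0 suggests a more symmetric, Gaussian-like shape.
skewness = {name: float(stats.skew(values)) for name, values in candidates.items()}
for name, value in skewness.items():
    print(f"{name:12s} skewness = {value:+.3f}")
```

Skewness is only one summary; a Q-Q plot (for example with `scipy.stats.probplot`) gives a fuller picture of how close each transformed variable is to a normal distribution.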

Comments

It is great having you as a digital mentor in this QUARANTINE
Great job 👍

shantanudolui

Thank you so much Krish for such an informative lecture. God bless you dear.

wajdilas

In the exponential distribution case, why didn't you use the exp_fare variable in diagnostic.plot? (You used sqr_fare instead.)

mahadevkandalkar

Really amazing video
Thank you so much sir
Explained beautifully

thedataguyfromB

Heartfelt thanks, Sir, for your efforts.

anandhuded

Sir, after transforming the data to a Gaussian distribution, do we also need to perform feature scaling, such as a min-max scaler or standard scaler?
Or should we apply only one of the two (Gaussian transformation or feature scaling) to a particular feature?

phanindratangirala
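
One possible arrangement for the question above is to apply the two steps in sequence, since they do different jobs: the transformation changes the shape of the distribution, while scaling only changes its location and spread. The sketch below assumes scikit-learn, a log transformation, and synthetic data; none of these choices come from the video.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import FunctionTransformer, StandardScaler

rng = np.random.default_rng(1)
# Synthetic skewed feature with one column (an assumption for illustration).
X = rng.lognormal(mean=3.0, sigma=1.0, size=(500, 1))

# Step 1 reshapes the distribution; step 2 rescales it to mean 0 and unit variance.
pipeline = make_pipeline(
    FunctionTransformer(np.log1p, inverse_func=np.expm1),
    StandardScaler(),
)
X_ready = pipeline.fit_transform(X)

print("mean:", X_ready.mean(), "std:", X_ready.std())
```

Whether scaling is needed at all depends on the downstream model; distance-based and gradient-based learners are usually more sensitive to feature scale than tree-based ones.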

The features (regressors) do not have to be normally distributed. If there is heteroscedasticity in the residuals, it's likely that the model is underspecified, and transformation of variables is one of the techniques used to eliminate it.

swat_katz_tbone

Thanks for the nice video. I have one suggestion: please use the feature-engine library for missing value imputation instead of writing so much manual code.

shreyasb.s

The assumption of linearity implies that the regression must be linear in the coefficients, not in the independent variables. E.g., we can have a regression equation like y = m1 X1 + m2 X2^2 + c. This means y is linearly related to m1 and m2, but not necessarily to X1, X2, etc. We cannot have equations like
y = m1^2 X1 + m2^2 X2 ... So point No. 1 mentioned in the video is incorrect.

manishgaurav
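
The point made in this comment can be checked numerically. The sketch below generates data from y = m1 X1 + m2 X2^2 + c and recovers the coefficients with ordinary least squares by treating X2^2 as just another column of the design matrix. The true coefficient values, noise level, and variable names are assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(2)
x1 = rng.uniform(-3, 3, size=300)
x2 = rng.uniform(-3, 3, size=300)

# Nonlinear in x2, but still linear in the coefficients (2.0, 0.5, 1.0).
y = 2.0 * x1 + 0.5 * x2**2 + 1.0 + rng.normal(scale=0.1, size=300)

# Design matrix: each column is a (possibly nonlinear) function of the inputs.
design = np.column_stack([x1, x2**2, np.ones_like(x1)])

# Ordinary least squares solves for the coefficients directly.
coefficients, *_ = np.linalg.lstsq(design, y, rcond=None)
m1, m2, c = coefficients

print(f"m1 = {m1:.3f}, m2 = {m2:.3f}, c = {c:.3f}")
```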

Should I always transform non-normal independent variables? Transforming the variables seems to change the interpretation of the variables themselves. In that case, how do I handle outliers in the independent variables? Can I go for outlier trimming and capping instead? Can I leave the skewness of the independent variables as it is? Kindly advise.

shobithas

Hi Krish,
How do I reverse the transformation after prediction to get the actual predicted number? I am having problems with this in a project. Please help.

chayanmehrotra
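
One way to approach the question above, assuming the transformed variable is the prediction target, is to apply the inverse function to the model's output. The sketch below uses scikit-learn's `TransformedTargetRegressor`, which applies `func` before fitting and `inverse_func` after predicting, and compares it with doing the same steps by hand. The log transformation and the synthetic data are assumptions for illustration.

```python
import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(3)
X = rng.uniform(0, 5, size=(400, 1))
# Synthetic positive, skewed target (an assumption, not the commenter's data).
y = np.exp(0.6 * X[:, 0] + rng.normal(scale=0.1, size=400))

# The wrapper fits on log1p(y) and returns predictions on the original scale.
model = TransformedTargetRegressor(
    regressor=LinearRegression(),
    func=np.log1p,
    inverse_func=np.expm1,
)
model.fit(X, y)
predictions = model.predict(X[:5])

# Equivalent manual version: transform, fit, predict, then invert.
manual_model = LinearRegression().fit(X, np.log1p(y))
manual_predictions = np.expm1(manual_model.predict(X[:5]))

print(predictions)
print(manual_predictions)
```

For a fitted transformer such as scikit-learn's `PowerTransformer`, the same idea applies through its `inverse_transform` method.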

Great job, Sir. Such a well-explained video. But I was not able to follow the imputation technique. If possible, please explain that part.

saswatpriyabrat

Hi sir, thanks for the great videos. Please upload an NLP playlist. First like, sir.

Trouble.drouble

It's really the best explanation you have provided. I appreciate it. By the way, there is a mistake at timestamp 20:39: you plotted the square root fare again instead of the exponential fare.

ganeshnvsnm

Does a Gaussian distribution affect the accuracy of OLS Linear regression or is it applicable only for gradient descent linear regression?

kirangeorge

Please make a separate video on Box Cox

raneshmitra

Great job, mate. Quick question: as far as I know, linear regression does not assume feature normality. Can you point to a source in the literature?

justfun

Can we apply a transformation to a feature more than once? For example, first an exponential transformation and then a logarithmic transformation, or something along those lines?

anamitrasingha

Hi Krish. I think logistic regression does not assume that the variables are normally distributed. Can you shed some light on this?

lakshitakamboj

Sir, shouldn't the outlier treatment be done before this step?

vigneshg
