Data Science Interview Questions- Multicollinearity In Linear And Logistic Regression

Please join as a member of my channel to get additional benefits like Data Science materials, live streams for members, and much more.

Please also subscribe to my other channel.

Connect with me here:

Comments

Diff 1 ---> Gradient descent takes all the data points into consideration to update the weights during backpropagation to minimize the loss, whereas stochastic gradient descent considers only one data point at a time for the weight update.


Diff 2 ---> In gradient descent, convergence towards the minima is fast, whereas in stochastic gradient descent convergence is slow.


Since in gradient descent all the data points are loaded and used for the calculation, the computation gets slow, whereas stochastic gradient descent is comparatively fast.
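A minimal Python sketch of the comparison above, assuming a linear model with an MSE loss; the data, learning rates, and epoch counts are made-up illustrative choices:

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))                    # 1000 samples, 3 features
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + rng.normal(scale=0.1, size=1000)

w_gd, w_sgd = np.zeros(3), np.zeros(3)
for epoch in range(50):
    # Gradient descent: one update per epoch using ALL data points.
    grad = 2 * X.T @ (X @ w_gd - y) / len(y)
    w_gd -= 0.1 * grad
    # Stochastic gradient descent: one update per single (shuffled) data point.
    # A smaller learning rate is used here because single-sample gradients are noisy.
    for i in rng.permutation(len(y)):
        grad_i = 2 * X[i] * (X[i] @ w_sgd - y[i])
        w_sgd -= 0.01 * grad_i

print("GD estimate :", np.round(w_gd, 3))
print("SGD estimate:", np.round(w_sgd, 3))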

harshstrum

GD: Runs all the samples in the training set to do a single update of all parameters in a given iteration.
SGD: Uses only one sample (or a small subset) from the training set to update the parameters in a given iteration.
GD: If the number of samples/features is large, it takes much more time to update the values.
SGD: It is faster because only one training sample is used per update.
SGD converges faster than GD.

ShivShankarDutta

Multicollinearity may not be a problem every time. The need to fix multicollinearity depends primarily on the reasons below:

When you care more about how much each individual feature (rather than a group of features) affects the target variable, removing multicollinearity may be a good option.
If multicollinearity is not present in the features you are interested in, then multicollinearity may not be a problem.
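One quick way to see whether the features you care about are involved in multicollinearity is to inspect the pairwise correlation matrix; a small pandas sketch with made-up data and an arbitrary 0.8 threshold:

import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
x1 = rng.normal(size=500)
x2 = 0.95 * x1 + 0.05 * rng.normal(size=500)      # nearly a copy of x1
x3 = rng.normal(size=500)                         # unrelated feature
df = pd.DataFrame({"x1": x1, "x2": x2, "x3": x3})

corr = df.corr().abs()
threshold = 0.8                                   # flag only strongly related pairs
pairs = [(a, b, round(corr.loc[a, b], 3))
         for i, a in enumerate(corr.columns)
         for b in corr.columns[i + 1:]
         if corr.loc[a, b] > threshold]
print(pairs)                                      # expect only the (x1, x2) pair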

bharathjc

Hi, Krish
Gradient descent: on a big volume of data it takes a larger number of iterations, and each iteration works with the entire dataset, so it causes high latency and needs more computing power.
Solution: batch gradient descent.
Batch gradient descent: the data is split into multiple batches, gradient descent is applied to each batch separately, a separate minimum loss is reached for each batch, and finally the weight matrix with the globally minimum loss is taken.
Problem with batch gradient descent: each batch contains only a few of the patterns in the entire data, which means other patterns are missed and the model couldn't learn all the patterns from the data.
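A rough Python sketch of the splitting-into-batches idea described above (commonly called mini-batch gradient descent); the batch size, learning rate, and data are arbitrary choices:

import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(1000, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=1000)

w, lr, batch_size = np.zeros(3), 0.05, 32
for epoch in range(20):
    order = rng.permutation(len(y))               # reshuffle every epoch
    for start in range(0, len(y), batch_size):
        idx = order[start:start + batch_size]
        Xb, yb = X[idx], y[idx]
        grad = 2 * Xb.T @ (Xb @ w - yb) / len(yb) # gradient on this batch only
        w -= lr * grad

print("mini-batch estimate:", np.round(w, 3))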

mahender

In batch gradient descent, you compute the gradient over the entire dataset, averaging over potentially a vast amount of information.
It takes lots of memory to do that. But the real handicap is that the batch gradient trajectory can land you in a bad spot (a saddle point).

In pure SGD, on the other hand, you update your parameters by adding (minus sign) the gradient computed on a single instance of the dataset.
Since it's based on one random data point, it's very noisy and may go off in a direction far from the batch gradient.
However, the noisiness is exactly what you want in non-convex optimization, because it helps you escape from saddle points or local minima
GD theoretically minimizes the error function better than SGD. However, SGD converges much faster once the dataset becomes large.
That means GD is preferable for small datasets, while SGD is preferable for larger ones.

bharathjc

@2:26 could you please explain what disadvantage it can cause to model performance? I mean, if I remove correlated features, will my model performance increase or stay the same?
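A quick way to explore this question empirically (a sketch with synthetic data and scikit-learn; the feature names f1/f2/f3 are made up): fit the model with and without one of two nearly identical features and compare the test scores.

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(3)
n = 1000
f1 = rng.normal(size=n)
f2 = f1 + 0.01 * rng.normal(size=n)               # almost identical to f1
f3 = rng.normal(size=n)
y = 3 * f1 + 2 * f3 + rng.normal(scale=0.5, size=n)

X_full = np.column_stack([f1, f2, f3])
X_drop = np.column_stack([f1, f3])                # f2 removed
for name, X in [("all features", X_full), ("f2 dropped", X_drop)]:
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    model = LinearRegression().fit(X_tr, y_tr)
    print(name, "test R^2:", round(model.score(X_te, y_te), 4))
# Predictive accuracy is usually almost unchanged; what multicollinearity
# destabilises is the individual coefficients, not the overall fit.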

rahuldey

The GD algorithm uses all the data to update the weights when optimising the loss function in the backpropagation algorithm. However, SGD uses a single data sample at each iteration.

brahimaksasse

Let's assume we are using an MSE cost function.
Gradient Descent -> It takes all the points into account for computing the derivatives of the cost function w.r.t. each feature, which tells the right direction to move in. It is not efficient if we have a large number of data points.
SGD -> It computes the derivatives of the cost function w.r.t. each feature based on a single data point (or some subset of data points) and moves in that direction, treating it as the right direction. So it greatly reduces the computational complexity.

sathwickreddymora

Lasso and Ridge regression - a precondition is that there should not be multicollinearity. If we see a linear relationship between the independent variables, like the one we see between the dependent and independent variables, we call it multicollinearity, which is not the same thing as simple correlation.
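For reference, a small scikit-learn sketch (synthetic data, arbitrary alpha values) of how ordinary least squares, ridge, and lasso coefficients behave when two features are nearly collinear:

import numpy as np
from sklearn.linear_model import LinearRegression, Ridge, Lasso

rng = np.random.default_rng(4)
n = 500
x1 = rng.normal(size=n)
x2 = x1 + 0.01 * rng.normal(size=n)               # x2 is almost a copy of x1
X = np.column_stack([x1, x2])
y = 3 * x1 + rng.normal(scale=0.5, size=n)

for name, model in [("OLS", LinearRegression()),
                    ("Ridge", Ridge(alpha=1.0)),
                    ("Lasso", Lasso(alpha=0.1))]:
    model.fit(X, y)
    print(name, "coefficients:", np.round(model.coef_, 3))
# OLS coefficients on x1/x2 tend to be large and unstable (they can offset
# each other); ridge shrinks them towards similar values, and lasso tends
# to push one of the pair to exactly zero.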

sridhar

Sir, kindly make all the videos on feature engineering and feature selection that are present in your GitHub link, please.

cutyoopsmoments

Sir, can we use PCA to reduce multicollinearity if we have, say, more than 200 columns?

sarveshmankar

Stochastic gradient descent is a variant where the data points are picked randomly, unlike the other type of gradient descent where the global minimum is found after training on the entire dataset.

K-mkpc

If you have a large feature space that contains multicollinearity, you could also try running a PCA and using only the first n components in your model (where n is the number of components that collectively explain at least 80% of the variance), since they are by definition orthogonal to each other.
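A sketch of that idea, assuming scikit-learn; passing a float to n_components keeps just enough components to explain that fraction of the variance:

import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(5)
X = rng.normal(size=(500, 50))
X[:, 1] = X[:, 0] + 0.01 * rng.normal(size=500)   # build in some collinearity

X_scaled = StandardScaler().fit_transform(X)      # PCA is scale-sensitive
pca = PCA(n_components=0.80)                      # keep >= 80% of the variance
X_reduced = pca.fit_transform(X_scaled)

print("components kept:", pca.n_components_)
print("variance explained:", pca.explained_variance_ratio_.sum().round(3))
# The retained components are orthogonal by construction, so the
# transformed features carry no multicollinearity into the model.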

DionysusEleutherios

Hi Krish... thanks for such a clear explanation. For regression problems on large datasets we have ridge and lasso. What about classification problems? How do we deal with multicollinearity on large datasets?
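One commonly used option (sketched here with scikit-learn on made-up data, not necessarily what the video recommends) is a penalised logistic regression, the classification analogue of ridge/lasso:

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(6)
n = 1000
x1 = rng.normal(size=n)
x2 = x1 + 0.01 * rng.normal(size=n)               # highly correlated pair
X = np.column_stack([x1, x2, rng.normal(size=n)])
y = (x1 + X[:, 2] + rng.normal(scale=0.5, size=n) > 0).astype(int)

l2_model = LogisticRegression(penalty="l2", C=1.0).fit(X, y)
l1_model = LogisticRegression(penalty="l1", C=1.0, solver="liblinear").fit(X, y)
print("L2 coefficients:", np.round(l2_model.coef_, 3))
print("L1 coefficients:", np.round(l1_model.coef_, 3))   # often zeroes one of x1/x2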

charlottedsouza

In addition, can you create a separate playlist for interview questions so that they are all in one place?

charlottedsouza

But how do you actually pick which feature to drop, f1 or f2?
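One common heuristic for this question (a sketch assuming statsmodels, with made-up features f1/f2/f3): compute the variance inflation factor for each feature and drop the one with the highest VIF, or the one that is less interpretable or less correlated with the target.

import numpy as np
import pandas as pd
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(7)
n = 500
f1 = rng.normal(size=n)
f2 = f1 + 0.05 * rng.normal(size=n)               # f2 nearly duplicates f1
f3 = rng.normal(size=n)
X = pd.DataFrame({"f1": f1, "f2": f2, "f3": f3})

vif = pd.Series(
    [variance_inflation_factor(X.values, i) for i in range(X.shape[1])],
    index=X.columns,
)
print(vif)                                        # f1 and f2 show large VIFs; f3 stays near 1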

sebastianroubert

@krishNaik you can add the links for the lasso and ridge regularization techniques in this current video. I think that would be helpful and beneficial for both parties as well.

dragonhead

Is it recommended to remove highly negatively correlated features?

Arjungtk

That was a clear explanation... thanks Krish. Small request: can you make a video on feature selection using at least 15-20 variables, based on multicollinearity, for better understanding through practice?

ganeshprabhakaran

When you are using a small dataset and x1, x2 are highly correlated, which one do you drop?

haneulkim