R Stats: Multiple Regression - Variable Preparation

This video gives a quick overview of constructing a multiple regression model in R to estimate vehicle prices from their characteristics. The video focuses on how to prepare variables while performing a stepwise regression with backward elimination of variables. The lesson explains how to transform highly skewed variables (using a log10 transform) and how to report their characteristics afterwards, how to check variables for normality, for multicollinearity (using Variance Inflation Factors) and for extreme values (using Cook's distance). The process is guided by measures of model quality, such as the R-squared and adjusted R-squared statistics, and by the variables' p-values, which reflect the confidence in their coefficients. As always, the final model is evaluated by calculating the correlation between predicted and actual vehicle prices for both the training and validation data sets, with correction for the previously transformed variables. The explanation is deliberately informal and avoids the more complex statistical concepts. Note that visual presentation and interpretation of multiple regression results will be explained in the next lesson.
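
As a rough illustration of the workflow described above, here is a minimal R sketch. It is not the author's script: the file name Auto.csv and the predictors Curb.weight, Peak.rpm and Horsepower are assumptions based on the comments below, and step() with AIC stands in for the manual p-value-driven backward elimination shown in the video.

library(car)  # provides vif()

auto <- read.csv("Auto.csv")          # assumed file name
auto$Log.Price <- log10(auto$Price)   # log10 transform tames the skewed target

set.seed(2017)                        # fixed seed makes the split reproducible
train.idx    <- sample(nrow(auto), round(0.7 * nrow(auto)))
train.sample <- auto[train.idx, ]
valid.sample <- auto[-train.idx, ]

# Fit a full model, then eliminate variables backwards
fit <- lm(Log.Price ~ Curb.weight + Peak.rpm + Horsepower, data = train.sample)
fit <- step(fit, direction = "backward")

summary(fit)                # R-squared, adjusted R-squared, coefficient p-values
vif(fit)                    # multicollinearity via Variance Inflation Factors
plot(cooks.distance(fit))   # spot extreme, influential observations

# Evaluate on the validation set, undoing the log10 transform first
pred.price <- 10^predict(fit, newdata = valid.sample)
cor(pred.price, valid.sample$Price)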

The data for this lesson can be obtained from the well-known UCI Machine Learning archives:

The R source code for this video can be found here (some small discrepancies are possible):

Comments

Hi Sir, I see that the final model for your multiple regression after backward elimination only used two variables: Peak.rpm and Curb.weight. When testing the final model on the validation/test set, can't we just do this:
valid.sample$Pred.Price <- predict(fit, newdata = valid.sample) ?

Why did you do this instead?

valid.sample$Pred.Price <- predict(fit, newdata = subset(valid.sample, select=c(Price, Peak.rpm, Curb.weight)))

Do you mind explaining why you needed to subset the valid.sample set to just the variables that the model ended up using? Why does it matter? Thanks!

killa
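
For anyone puzzling over the same point: predict() looks up only the variables named in the fitted model's formula, so both calls should return identical predictions whenever those columns are present in newdata. A quick check, reusing the names from the comment above:

# Both calls use only the formula variables, so the results should match
p1 <- predict(fit, newdata = valid.sample)
p2 <- predict(fit, newdata = subset(valid.sample,
                                    select = c(Price, Peak.rpm, Curb.weight)))
all.equal(as.numeric(p1), as.numeric(p2))   # expected: TRUE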

Very helpful video. Thank you for posting!

allisonhaaning

Hello prof. Thank you for all of your lessons, these are really helpful. My question is: how do we do the back-transformation of the log10 for reporting requirements? And what does the model equation look like? Thank you in advance.

benediktusnugrohoadiwiyoto
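
On the back-transformation question: since the model is fitted on log10(Price), raising 10 to the predicted value returns predictions to the original scale. A minimal sketch, with names assumed from the comments above:

log.pred <- predict(fit, newdata = valid.sample)   # predictions on log10 scale
valid.sample$Pred.Price <- 10^log.pred             # back to original price scale

# The reportable equation, with b0, b1, b2 on the log10 scale, is then:
#   Price = 10^(b0 + b1 * Curb.weight + b2 * Peak.rpm)
coef(fit)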

Hi professor, thanks for the great tutorial.
Just out of curiosity, why do you use the number 2017 in set.seed()?
Many thanks

klaldju
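
On the set.seed() question: the argument is an arbitrary constant (2017 is presumably just the year); any fixed integer makes the random sampling reproducible across runs:

set.seed(2017)   # any constant works; it only pins down the random stream
sample(10, 3)    # returns the same three numbers on every run with this seed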

Hello sir, how do we check for non-linearity if the variables are factors instead of numerical?

Or do we just fit the full model and then check for linearity from that full model?

harithsyafiqhalim

Many thanks, nice video. Can you please check the link for the R source code? It is not working. Thanks.

muhammadsaleemkhan

Since this video was created, the UCI Machine Learning repository has moved to a new location, which means the web address shown in the script no longer works. However, I have updated the link to the lesson data in the video description.

ironfrown

What if, after eliminating some extreme values, the R-squared becomes smaller instead?

mohammadumam

Shouldn't it be sqrt(vif(fit)) instead?

sambad
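
On the sqrt(vif()) question: both conventions appear in practice; a common rule of thumb flags predictors with VIF above 5 or 10, while some texts use sqrt(VIF) > 2 instead. A minimal sketch, assuming the car package:

library(car)
vif(fit)              # Variance Inflation Factor per predictor
sqrt(vif(fit)) > 2    # an alternative rule of thumb for problematic collinearity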

Great video, but the link to the data is not working. I have cleaned and prepared the data to save you guys time.
The data is formatted to suit the R code in the description above.
The data file is Auto.csv, with 205 rows and 26 columns.

After importing the data into R, while imputing the NA values, please note that
auto$Num.of.doors <- as.numeric(impute(auto$Num.of.doors, median)) in the R source code did not work, because Num.of.doors is of class character. You have to change it to
auto$Num.of.doors <- as.character(impute(auto$Num.of.doors, median)) for it to work.

xymabuka
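
Building on the fix above, a minimal self-contained sketch (assuming the Hmisc package and the Auto.csv file described in the comment; note that the median is not meaningful for a character column, so imputing the most frequent value is a common alternative):

library(Hmisc)
auto <- read.csv("Auto.csv", stringsAsFactors = FALSE)

# impute() with a constant replaces NAs with that value; for a categorical
# column such as Num.of.doors, use the most frequent category (the mode)
most.common <- names(which.max(table(auto$Num.of.doors)))
auto$Num.of.doors <- as.character(impute(auto$Num.of.doors, most.common))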