Machine Learning: Python Multiple Linear Regression | Predict house price | Predictive Analytics

This linear regression tutorial performs a multiple regression task on a housing dataset, predicting a target variable from many independent variables. It is a step-by-step tutorial that guides you through exploratory data analysis, data processing, fixing skewness, encoding categorical variables, and treating missing or NaN values. Once the data is in proper shape, we create and train the linear regression model.
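
As a rough illustration of the preprocessing steps listed above, here is a minimal sketch on a tiny made-up DataFrame. The column names and values are placeholders, and the median/mode fill is an assumption chosen for brevity; the notebook itself may treat missing values differently.

```python
import numpy as np
import pandas as pd

# Tiny made-up frame; column names and values are illustrative only.
df = pd.DataFrame({
    "GrLivArea": [1710.0, 1262.0, np.nan, 1717.0, 2198.0],
    "Neighborhood": ["A", "B", "A", None, "A"],
    "SalePrice": [208500.0, 181500.0, 223500.0, 140000.0, 250000.0],
})

# Treat missing values: median for the numeric column, most frequent
# value for the categorical one (an assumed strategy, not the notebook's).
df["GrLivArea"] = df["GrLivArea"].fillna(df["GrLivArea"].median())
df["Neighborhood"] = df["Neighborhood"].fillna(df["Neighborhood"].mode()[0])

# Reduce right skew with log(1 + x); np.expm1 reverses it later.
df["SalePrice_log"] = np.log1p(df["SalePrice"])

# One-hot encode the categorical column.
encoded = pd.get_dummies(df, columns=["Neighborhood"])
```

After this, `encoded` has no missing values and one indicator column per category, which is the shape a linear model expects.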

We use a training dataset that contains property attributes and the corresponding sale price. We train a linear regression model using the sklearn library. Using this model, we then make predictions, check their accuracy, and compare them against the actual values.
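
The train, predict, and check flow described above can be sketched as follows. The data here is synthetic (generated from a known linear relationship plus noise) so the block runs without the housing files; the variable names are assumptions, not necessarily those used in the notebook.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic data: three features and a target that is linear in them plus noise.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 3))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.5 * X[:, 2] + 4.0 + rng.normal(0, 0.1, 200)

# Hold out 20% of the rows for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Create and train the model.
model = LinearRegression()
model.fit(X_train, y_train)

# Predict on unseen rows and score with R^2.
predictions = model.predict(X_test)
r2 = model.score(X_test, y_test)
```

`model.score` returns the coefficient of determination (R^2) on the held-out rows; values near 1 mean the predictions track the actual values closely.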

To download the data and notebook, go to:
Click the green button to clone or download the entire repository, then go to the relevant folder to access the specific file.

Link to Exploratory Data Analysis with Python video:

#MachineLearning #LinearRegression #MultiLinearRegression

Topics covered in this Machine Learning Video:
0:00 Multiple linear regression Overview
1:02 Factors to consider
1:47 Import required libraries
2:20 Import data in dataframe
2:39 Exploratory data analysis
3:56 Correlation
4:36 Scatter plot
5:33 Data Distribution - density plot
5:57 Data Preprocessing
6:12 Remove Outliers and missing values
6:40 Encoding Categorical Data
7:49 Log transformation
8:23 Split data in train and test
9:23 Create and train Linear Regression model
9:39 Check accuracy of predictions
9:50 Predict a single value
10:39 Compare Actual vs Predicted values
11:10 Model preview in web app
Comments
Author

Very good video! I am going through it. Question: after line 16, is there a way to export to CSV to see the actual difference compared to the original? I feel like this would help me.
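import numpy as np

# Fill missing values in each column by sampling from that column's observed
# values, weighted by how often each value occurs.
for column in train.columns:
    null_mask = train[column].isna()
    if not null_mask.any() or null_mask.all():
        continue  # nothing to fill, or nothing to sample from
    values, counts = np.unique(train.loc[~null_mask, column].to_numpy(), return_counts=True)
    train.loc[null_mask, column] = np.random.choice(values, null_mask.sum(), p=counts / counts.sum())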

ShiftKoncepts
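
For the CSV question above, one possible approach is to keep a copy of the DataFrame before the fill and write the before and after columns side by side. This is only a sketch: the column names, the constant fill values, and the output file name are placeholders, and the constant fill stands in for whatever imputation step is actually used.

```python
import tempfile
from pathlib import Path

import numpy as np
import pandas as pd

# Made-up frame standing in for `train`; column names are placeholders.
train = pd.DataFrame({
    "LotFrontage": [65.0, np.nan, 80.0],
    "Alley": ["Pave", None, None],
})

# Keep an untouched copy before any missing values are filled.
original = train.copy()

# Placeholder for the real imputation step.
train = train.fillna({"LotFrontage": 70.0, "Alley": "Grvl"})

# Count how many cells were filled.
was_missing = original.isna()
n_filled = int(was_missing.sum().sum())

# Put the before and after columns next to each other and export them.
side_by_side = original.add_suffix("_before").join(train.add_suffix("_after"))
out_path = Path(tempfile.gettempdir()) / "train_before_after.csv"
side_by_side.to_csv(out_path, index=False)
```

Opening the CSV in a spreadsheet then shows each original value next to its filled counterpart.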
Author

What code modification would I need to make if I only wanted to use the top two correlated features such as "GrLivArea" and "OverallQual"?

ShiftKoncepts
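
For the two-feature question above, the usual change is to pass only the chosen columns as the feature matrix. The sketch below uses synthetic data with the column names from the question plus an assumed "YearBuilt" and "SalePrice"; it also shows one way to pick the two most correlated features programmatically instead of hard-coding them.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Synthetic data: SalePrice depends on GrLivArea and OverallQual only.
rng = np.random.default_rng(0)
n = 100
train = pd.DataFrame({
    "GrLivArea": rng.uniform(800, 3000, n),
    "OverallQual": rng.integers(1, 11, n).astype(float),
    "YearBuilt": rng.integers(1950, 2011, n).astype(float),
})
train["SalePrice"] = (
    80.0 * train["GrLivArea"]
    + 15000.0 * train["OverallQual"]
    + rng.normal(0, 5000, n)
)

# Rank features by absolute correlation with the target and keep the top two.
corr_with_target = train.corr()["SalePrice"].drop("SalePrice").abs()
top_two = corr_with_target.nlargest(2).index.tolist()

# Train on just those two columns.
X = train[top_two]
y = train["SalePrice"]
model = LinearRegression().fit(X, y)
```

Any later call to `model.predict` must then be given the same two columns in the same order.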
Author

I'm getting an error when I try to plot the actual vs predicted values: "NameError: name 'predictions' is not defined".

ShiftKoncepts
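
A NameError like the one above generally means the name was never assigned in the current session, for example because the cell that calls `predict` was skipped or stores its result under a different name. A minimal sketch with made-up data, assuming the variable is meant to hold the output of `model.predict`:

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Made-up data standing in for the notebook's train/test split.
X_train = np.array([[1.0], [2.0], [3.0], [4.0]])
y_train = np.array([2.0, 4.0, 6.0, 8.0])
X_test = np.array([[5.0], [6.0]])
y_test = np.array([10.0, 12.0])

model = LinearRegression().fit(X_train, y_train)

# This assignment must run before any cell that plots or compares `predictions`.
predictions = model.predict(X_test)

# Actual vs predicted values side by side, ready for plotting.
comparison = pd.DataFrame({"Actual": y_test, "Predicted": predictions})
```

Once `predictions` exists, the plotting cell can refer to it without raising the error.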
Author

Hello, thanks for this informative video. But I felt one thing was missing: at the end you showed a prediction based on the web application, but what if we want to enter x values manually and check the predicted values in the program above? It would be a pleasure if you could explain.

bhushannarkar
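
For the question above about entering x values by hand, one option is to build a one-row DataFrame with the same feature columns the model was trained on and pass it to `predict`. The sketch below uses made-up training rows with an exactly linear price, so the feature names and numbers are illustrative assumptions only.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Made-up training rows; the price is an exact linear function of the features.
train = pd.DataFrame({
    "GrLivArea": [1000.0, 1500.0, 2000.0, 2500.0, 1200.0, 1800.0],
    "OverallQual": [5.0, 6.0, 7.0, 8.0, 6.0, 5.0],
})
train["SalePrice"] = 100.0 * train["GrLivArea"] + 10000.0 * train["OverallQual"]

features = ["GrLivArea", "OverallQual"]
model = LinearRegression().fit(train[features], train["SalePrice"])

# Manually entered values for one house, in the same column order as training.
new_house = pd.DataFrame([{"GrLivArea": 1600.0, "OverallQual": 7.0}], columns=features)
predicted_price = model.predict(new_house)[0]
```

If the features or target were transformed before training (for example with a log transformation), the manual input would need the same transformation, and the prediction would need the inverse applied (such as `np.expm1` for a `np.log1p` target).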