Linear Regression in R | Linear Regression in R With Example | Data Science Algorithms | Simplilearn

This Linear Regression in R video explains what linear regression is, why it is used, and how to apply it in R with an example. It also walks through a use case: predicting a company's revenue using multiple linear regression. Now, let's dive into the video and understand this data science algorithm.

The following topics are explained in this Linear Regression in R video:
00:00 Introduction
00:28 Why linear regression?
03:09 What is linear regression?
03:38 How does linear regression work?
10:05 Use case - Predicting the revenue using linear regression

What is Linear Regression?
Linear regression is a statistical model that estimates the relationship between a dependent variable and one or more independent variables. It answers two questions. First, which variables are significant predictors of the outcome variable? Second, how well does the regression line fit the data, that is, how accurately can it be used to make predictions?
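
The two questions above can be illustrated with a minimal sketch in R. It does not use the video's dataset: the data is simulated, and the column names (Paid, Organic, Social, Revenue) are borrowed from the comments on this page as an assumption about what the real dataset looks like. The coefficient table shows which predictors are significant, and R-squared indicates how well the line fits.

```r
# Hypothetical, simulated data (not the dataset used in the video).
set.seed(42)
n <- 200
sales <- data.frame(
  Paid    = runif(n, 0, 1000),
  Organic = runif(n, 0, 1000),
  Social  = runif(n, 0, 1000)
)
# In this simulation, Revenue depends on Paid and Organic only;
# Social has no true effect.
sales$Revenue <- 50 + 3 * sales$Paid + 2 * sales$Organic + rnorm(n, sd = 100)

# Multiple linear regression with three candidate predictors.
fit <- lm(Revenue ~ Paid + Organic + Social, data = sales)

# Question 1: which predictors are significant? (see the Pr(>|t|) column)
coefs <- summary(fit)$coefficients
print(coefs)

# Question 2: how well does the line fit? (share of variance explained)
r_squared <- summary(fit)$r.squared
print(r_squared)
```

A small p-value for a predictor suggests it contributes to the outcome, while an R-squared close to 1 suggests the fitted line explains most of the variation in this simulated data.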

➡️ About Caltech Post Graduate Program In Data Science
This Post Graduate Program in Data Science draws on Caltech's academic excellence. The program covers critical data science topics such as Python programming, R programming, Machine Learning, Deep Learning, and Data Visualization tools through an interactive learning model with live sessions by global practitioners and practical labs.

✅ Key Features
- Simplilearn's JobAssist helps you get noticed by top hiring companies
- Caltech PG program in Data Science completion certificate
- Earn up to 14 CEUs from Caltech CTME
- Masterclasses delivered by distinguished Caltech faculty and IBM experts
- Caltech CTME Circle membership
- Online convocation by Caltech CTME Program Director
- IBM certificates for IBM courses
- Access to hackathons and Ask Me Anything sessions from IBM
- 25+ hands-on projects from the likes of Amazon, Walmart, Uber, and many more
- Seamless access to integrated labs
- Capstone projects in 3 domains
- Simplilearn’s Career Assistance to help you get noticed by top hiring companies
- 8X higher interaction in live online classes by industry experts

✅ Skills Covered
- Exploratory Data Analysis
- Descriptive Statistics
- Inferential Statistics
- Model Building and Fine Tuning
- Supervised and Unsupervised Learning
- Ensemble Learning
- Deep Learning
- Data Visualization

🔥🔥 Interested in Attending Live Classes? Call Us: IN - 18002127688 / US - +18445327688
Comments

@simplilearn In the subset function, = is used in place of ==. That's why the training and testing data sets have the same values. Please make this change in the video; it will be useful for many. Thank you.

advertisementmail
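
The issue described in the comment above can be reproduced with a small sketch. This is not the video's code: the data frame is simulated, and the logical vector named split is built with base R's sample() rather than the sample.split() function a later commenter mentions. With a single =, R treats split = TRUE as a named argument instead of a filter condition, so subset() returns every row.

```r
# Hypothetical, simulated data standing in for the video's sales data.
set.seed(123)
sales <- data.frame(Paid = runif(1000), Revenue = runif(1000))

# Logical vector marking exactly 700 rows as TRUE (train) and 300 as FALSE.
split <- sample(rep(c(TRUE, FALSE), times = c(700, 300)))

# Bug: `split = TRUE` is passed as a named argument, not a condition,
# so no filtering happens and both subsets contain all rows.
train_bug <- subset(sales, split = TRUE)
test_bug  <- subset(sales, split = FALSE)
print(c(nrow(train_bug), nrow(test_bug)))  # 1000 1000

# Fix: `==` makes it a row-wise comparison.
train <- subset(sales, split == TRUE)
test  <- subset(sales, split == FALSE)
print(c(nrow(train), nrow(test)))          # 700 300
```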

Nice tutorial, but I believe that in the part on finding the accuracy, it should be test$Revenue instead of sales$Revenue, because you are checking the accuracy of the pred model you created.

jamiiacademy
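
The point made in the comment above can be sketched as follows, assuming a model fitted on a training subset and evaluated on a held-out test subset. The data is simulated and the split uses row indices, both of which are assumptions for illustration rather than the video's actual code. Predictions made for the test rows have the same length as test$Revenue, so that is the column they should be compared against.

```r
# Hypothetical, simulated data (not the dataset used in the video).
set.seed(7)
n <- 1000
sales <- data.frame(Paid = runif(n, 0, 1000), Organic = runif(n, 0, 1000))
sales$Revenue <- 100 + 3 * sales$Paid + 2 * sales$Organic + rnorm(n, sd = 50)

# 70/30 split by row index.
idx   <- sample(n, 700)
train <- sales[idx, ]
test  <- sales[-idx, ]

# Fit on the training rows only, then predict for the test rows.
fit  <- lm(Revenue ~ Paid + Organic, data = train)
pred <- predict(fit, newdata = test)

# Compare against test$Revenue (300 values), not sales$Revenue (1000 values).
rmse_test <- sqrt(mean((pred - test$Revenue)^2))
print(length(pred))
print(rmse_test)
```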

Thank you, man, you made it so simple. In 3-4 hour classroom lectures, I do not understand shit. However, videos like this make it so simple.

ricksanchez

Hello @Simplilearn and beloved comment section, I tried using a scatter plot to visualize this data instead of the graph shown in this video. The code I have written is:

library(ggplot2)
library(scales)

# Training data: actual values as red points, model predictions as a navy line.
# colour is set outside aes() so it is a fixed colour rather than a mapped legend entry,
# and it is no longer passed to predict() by mistake.
# Note: y uses Train$Profit as written, although the axis label says Revenue.
ggplot() +
  geom_point(aes(x = Train$Paid + Train$Organic + Train$Social, y = Train$Profit), colour = "red") +
  geom_line(aes(x = Train$Paid + Train$Organic + Train$Social, y = predict(linear_regression, newdata = Train)), colour = "navy") +
  ggtitle("Predicted Revenue (training set)") + xlab("Paid, Organic & Social") + ylab("Revenue") +
  scale_x_continuous(limits = c(0, 480000))  # no trailing + after the last layer

# Testing data: same plot for the held-out rows.
ggplot() +
  geom_point(aes(x = Test$Paid + Test$Organic + Test$Social, y = Test$Profit), colour = "red") +
  geom_line(aes(x = Test$Paid + Test$Organic + Test$Social, y = predict(linear_regression, newdata = Test)), colour = "navy") +
  ggtitle("Predicted Revenue (testing set)") + xlab("Paid, Organic & Social") + ylab("Revenue") +
  scale_x_continuous(limits = c(0, 480000))

I will be more than grateful if any one of you can please verify whether it's the correct code or not. _/\_
The plot generated is somewhat similar, but I am still a little confused, because I wrote it myself.

ankitdas

Thanks so much, very nicely explained and demonstrated

jgcornell

Hi there, this is one of the best tutorials I've ever seen on linear regression. I'm very interested in the dataset and would appreciate it if you could share it, so I can practice hands-on alongside your video.

alioraji

That's great work!
Amazing work
Where can I get the dataset from?

rasalghul

After splitting the data, I tried using the str function to see the number of entries in both the test and train subsets, but it still shows the total number of rows of the main data set. So are the entries getting split or not?

durveshsawant

When you ran the split object, why were there only four Boolean variables in the output?

jayjayf

Great explanation!! Only issue is the quality of the mic.

richardschmalhofer

This is the nicest tutorial I have seen so far, really helpful. Could you please share the dataset?

anjalipriya

Thank you for a nice introduction to this topic. I just want to warn you that you made a mistake when creating the train and test data sets. They are identical, which explains the good fit when you compare the predicted and the actual values. It might be an idea to put the correct code in the description field. Anyway, you taught me how to do this. Thanks!

snorrefjeldbo

In the "Finding Accuracy" portion:

Per your dialogue, you want to subtract each pair and square each difference, THEN take the mean, THEN take the square root.

However, the way you typed it, wouldn't it find the mean of all of the differences and THEN square that? That is, wouldn't mean(pred - sales$Revenue)^2 take the mean of all the subtractions and then square it, as opposed to mean( (pred-sales$Revenue)^2 )?

zacharyboeder
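
The difference between the two expressions in the comment above can be checked with a tiny worked example. The numbers are made up for illustration. Because ^ is applied to the result of mean() in the first form, positive and negative errors cancel before squaring, which can make the error look far smaller than it is.

```r
# Hypothetical actual and predicted values; errors are 2, -2, 3, -3.
actual <- c(10, 20, 30, 40)
pred   <- c(12, 18, 33, 37)

# Misplaced parenthesis: mean of the errors (0) is squared, then rooted.
wrong <- sqrt(mean(pred - actual)^2)

# Root mean squared error: square each error, average, then take the root.
rmse <- sqrt(mean((pred - actual)^2))

print(wrong)  # 0
print(rmse)   # sqrt(6.5), about 2.55
```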

What are the numbers just below the predicted values after we run the pred function?

omkarrege

After splitting the sales data, both the train and test datasets have 1000 observations instead of 700 and 300 respectively, meaning the model is trained and tested on the same dataset.

erickizambo

Many thanks for the interesting tutorial! I was wondering whether sample.split() was used correctly. Can you please check with setequal() whether train and test are the same dataset?

yerkebulankambarov
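
The check suggested in the comment above might look like the following sketch. The data and the split vector are simulated with base R as an assumption, since the video's dataset is not available here. Comparing row names with setequal() and intersect() shows whether the two subsets share rows.

```r
# Hypothetical, simulated data and split (not the video's code).
set.seed(1)
sales <- data.frame(Paid = runif(1000), Revenue = runif(1000))
split <- sample(rep(c(TRUE, FALSE), times = c(700, 300)))

train <- subset(sales, split == TRUE)
test  <- subset(sales, split == FALSE)

# subset() keeps the original row names, so they identify each observation.
same_rows <- setequal(rownames(train), rownames(test))
overlap   <- intersect(rownames(train), rownames(test))

print(same_rows)        # FALSE when the split worked
print(length(overlap))  # 0 when no row is in both subsets
```

If same_rows were TRUE, or the overlap were non-empty, the model would be evaluated on rows it had already seen during training.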

Simple and super tutorial... please share the dataset.

THE.fatle.drawer

Thank you so much for the tutorial. Can I have the dataset so I can practice on my own?

taruneeshsachdeva

Could you share more videos on when to apply which algorithm?

ajaykasanna

Thank you so much for this video... learned a lot.

fathialwosaibi