Dummy Variables or Indicator Variables in R | R Tutorial 5.5 | MarinStatsLectures

Dummy Variables or Indicator Variables in R: how dummy or indicator variables are used to include categorical variables in a regression model in R

In this R video tutorial, we learn what dummy or indicator variables are and how they are used to include categorical or qualitative variables (factors) in a regression model in R.
To consider a categorical variable as a predictor in a regression model, we create indicator variables to represent the categories that are not the reference.
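As a minimal sketch of what those indicator variables look like, the base R function model.matrix() can display them. The data below are made up for illustration, and the level names A to F are an assumption, not the values used in the video.

```r
# Hypothetical height categories (level names A to F are assumed for illustration)
CatHeight <- factor(c("A", "B", "C", "D", "E", "F", "C", "A"))

# model.matrix() builds the design matrix that a regression would use:
# an intercept column plus one 0/1 indicator column per non-reference level
X <- model.matrix(~ CatHeight)
colnames(X)

# The same indicator for level "B", built by hand:
# 1 when the observation is in category B, 0 otherwise
is_B <- as.numeric(CatHeight == "B")
```

Here the first level, "A", is the reference, so it gets no column of its own: an observation in category A has 0 in every indicator column.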

Note: If you include a categorical variable in a regression model in R, R will automatically create dummy variables for it. For example, if X1 is a numeric variable and X2 is a categorical variable with 4 levels, then fitting the linear regression model lm(Y ~ X1 + X2) makes R create the necessary 3 dummy variables for X2 automatically.
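The note above can be checked with a small sketch. The data are simulated purely for illustration, and the level names "a" to "d" are assumptions.

```r
# Simulated data, for illustration only
n  <- 40
X1 <- rnorm(n)                                      # numeric predictor
X2 <- factor(rep(c("a", "b", "c", "d"), times = 10)) # categorical predictor, 4 levels
Y  <- 1 + 2 * X1 + rnorm(n)

# R expands the factor X2 into 3 dummy variables when fitting the model
fit <- lm(Y ~ X1 + X2)

# Coefficient names: "(Intercept)", "X1", "X2b", "X2c", "X2d"
names(coef(fit))
```

Level "a" has no coefficient because it is the reference; the coefficients X2b, X2c and X2d are each measured relative to it.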

The video provides a tutorial for programming in R Statistical Software for beginners.

■ Table of Contents:

0:00:11 Categorical or qualitative variables (AKA factors) can be included in a regression model using dummy or indicator variables
0:00:31 Introducing the categorical variable example in R (CatHeight)
0:00:50 How many dummy or indicator variables are needed to represent a categorical variable in a regression model?
0:01:37 How to create dummy or indicator variables for a categorical variable with 6 levels in R (an example)
0:02:18 How to determine the reference or baseline group in R?
0:02:35 How is an individual's height category represented using dummy or indicator variables in R?
0:03:46 How to interpret the model coefficients for the dummy or indicator variables in R?
0:05:52 Why do we use dummy or indicator variables in a regression model in R?
0:06:00 How does R create the dummy or indicator variables in a regression model?
0:06:07 How does R choose the reference or baseline category?
0:06:19 How to change which category or level serves as the reference or baseline group in R?
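The last questions in the list above can be sketched in a few lines of base R. This is an illustration with made-up level names (A to F), not the code from the video. It relies on R's default treatment contrasts, under which the first level of a factor is the reference and levels are sorted alphabetically unless specified otherwise.

```r
# Hypothetical factor with 6 levels (names assumed for illustration)
CatHeight <- factor(c("A", "B", "C", "D", "E", "F"))

# A factor with k levels needs k - 1 dummy variables
nlevels(CatHeight) - 1

# The reference (baseline) group is the first level
levels(CatHeight)[1]

# relevel() returns a copy of the factor with the chosen level moved to the front,
# which makes it the new reference group in a subsequent model fit
CatHeight2 <- relevel(CatHeight, ref = "C")
levels(CatHeight2)
```

Changing the reference does not change the fitted values of a model, only which group the dummy-variable coefficients are compared against.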

Our Team:
Content Creator: Mike Marin (B.Sc., M.Sc.), Senior Instructor at UBC.
Producer and Creative Manager: Ladan Hamadani (B.Sc., B.A., MPH)

These videos are created by #marinstatslectures to support some courses at The University of British Columbia (UBC) (#IntroductoryStatistics and #RVideoTutorials for Health Science Research), although we make all videos available to everyone everywhere for free.

Thanks for watching! Have fun and remember that statistics is almost as beautiful as a unicorn!
Comments

Mike Marin is a highly respected and distinguished lecturer in my heart! I have learned R programming and statistics from you, from zero to advanced levels. I always watch and follow your videos to learn R step by step at the office, and now I'm able to use it at least for regression and other things. I personally recommend these great videos to anyone interested in learning statistics and R, particularly beginners. Please keep uploading new videos to refresh our minds! Great thanks to Mike Marin! Wish you all the best!

teshomeabebe

Just finished the whole series of videos. They are very helpful and easy to understand. Thank you, Mike! 

woaiyinan

Mike, you are teaching statistics in R way better than my prof at the university. Thanks for the videos.

MAMADGOOLI

Thank you thank you thank you thank you thank you! This is exactly what I needed. My data set includes Zip Code as a predictor and I had questions about how best to turn them into dummy variables.

SanDiegoFreddy

I have been learning a lot about R just from your series of videos. Please keep them coming because I really need these in my ecology research. Thank you so much.

amybaron

Hope more R programming videos are coming, great job!

woaiyinan

Thank you. This is a great help for statistics.

MrCigarro

Hi, Mr. Marin, I think you forgot to mention that
CatHeight <- factor(CatHeight, ordered = TRUE, levels = c("F", "E", "D", "C", "B", "A"))

kourosbear

Been waiting for this! Finally here :) Thanks!!!

princeadu

Excellent video, Mike - crystal clear! Thanks so much!

ChunLin_UoE

How do I create CatHeight? Is there a quick way to do so in R? Many thanks.

CanDoSo_org

Hi Marin! Your explanations are really excellent! Congratulations!

jeovanischmitt

Cannot wait to see the GLM course video on air. Cheers

bramsetyadji

Hi there
This doesn't quite work for me...
I have a dependent variable SALES that I want to explain with some numerical variables (R&D, number of employees, ...) and a dummy variable for NACE (which divides companies into High-Tech, Medium-High-Tech, Medium-Low-Tech and Low-Tech).
Does R read this NACE code string as categorical right away? And if not, how should I deal with this?
(Nice video btw)
Thanks in advance

arnovandroogenbroeck

Hi Sir,
I was wondering how you changed the data for smoke into "0" and "1"?

Thank you.

yusriashifothmarani

These are so great! You're helping a biology grad student a great deal. 

Do you have any instructional videos on mixed and random effects models and ANCOVA?

seanpinnell

Hi, are these control variables? Is this how one would write the lm formula when controlling for several variables in R?

arehmankhn

Hi, thank you for your video. I have a question: instead of our model "lm(LungCap ~ CatHeight)", what result do we get if we use "lm(LungCap ~ factor(CatHeight))"? I need the theoretical differences between the two cases. Thank you.

rasinrs

Thank you sir. Got my concepts clear this time!

debapi

Super good, sir. Thank you so much for sharing the videos.

batraamit