Efficient Looping in R | RStudio FOR Loop Tutorial

Looping is one of the most common things we do in computer science and data science. Whether we are plotting the same type of graph over many datasets or applying the same function to every column, underneath many complicated high-level functions there is always a loop. Here I share some tips and tricks I found online, along with some personal best practices from my years of coding.

GitHub Repo

GitHub Pages

Multi threading in RStudio

Comments
Author

0:45 What is a loop
3:18 Using a list in a for loop
4:10 Break and Next in a loop
6:30 Efficient Looping Practices
7:35 Don't make objects you don't use
8:35 Garbage Collection
10:35 Initialize objects before loops
10:50 system.time()
11:10 Using simple data types
13:51 Declare size of output before the loop
15:35 Short Summary
16:05 Parallel Computing

PeihuiBrandonYeo
Author

Great video!
One question about 14:59: Wouldn't it make sense to include the creation of P3 in the system.time statement for a better comparison? list() is probably faster than seq(1, (but of course not fast enough to overcompensate the multiple reallocation of the list length)
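
To make the point concrete, here is a minimal sketch of a timing comparison in which the allocation of the output object sits inside each `system.time()` call. The names `out_grow`, `out_pre`, and the loop length `n` are made up for illustration and are not the objects from the video. On recent R versions the gap between the two may be smaller than expected.

```r
n <- 1e4  # assumed loop length, chosen only for illustration

# Growing the output: list() is created inside the timed block.
time_grow <- system.time({
  out_grow <- list()
  for (i in seq_len(n)) out_grow[[i]] <- i^2
})

# Preallocated output: vector("list", n) is also inside the timed block,
# so both timings include the cost of creating the object.
time_prealloc <- system.time({
  out_pre <- vector("list", n)
  for (i in seq_len(n)) out_pre[[i]] <- i^2
})

print(time_grow["elapsed"])
print(time_prealloc["elapsed"])
```

Timings vary from run to run, so repeating the comparison several times gives a fairer picture.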

marioreutter
Author

Hi! Thank you for the explanation, so good!
One more thing: could you please show how to detect outliers for more than one variable using a loop and boxplots?
I saw this code online that uses a loop to detect outliers with boxplots.

# Check for outliers in the data using boxplots
library(ggplot2)

cnames <- colnames(day[, c("actual_temp", "actual_feel_temp", "actual_windspeed", "actual_hum")])
for (i in seq_along(cnames)) {
  # Build one boxplot per column and store it as gn1, gn2, ... via assign()
  assign(paste0("gn", i),
         ggplot(data = day, aes_string(y = cnames[i])) +
           stat_boxplot(geom = "errorbar", width = 0.5) +
           geom_boxplot(outlier.colour = "red", fill = "grey", outlier.shape = 18,
                        outlier.size = 1, notch = FALSE) +
           labs(y = cnames[i]) +
           ggtitle(paste("Box plot for", cnames[i])))
}
# Arrange the four plots in a two-column grid
gridExtra::grid.arrange(gn1, gn3, gn2, gn4, ncol = 2)

It was a little confusing. Could you please explain it? Thank you.
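
If the goal is the outlier values themselves rather than the plots, a loop over columns with base R's `boxplot.stats()` may be simpler. This is only a sketch: the data frame `df` and its columns `a` and `b` are invented for illustration, and the 1.5 × IQR whisker rule is the function's default, not something taken from the video.

```r
# Hypothetical data frame; the column names are illustrative only.
df <- data.frame(
  a = c(1, 2, 3, 4, 5, 100),
  b = c(10, 11, 12, 13, 14, 15)
)

# Preallocate a named list with one slot per column.
outliers <- vector("list", ncol(df))
names(outliers) <- names(df)

for (col in names(df)) {
  # $out holds the points beyond the boxplot whiskers (default coef = 1.5)
  outliers[[col]] <- boxplot.stats(df[[col]])$out
}

print(outliers)
```

Each list element holds the values a boxplot of that column would draw as separate outlier points.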

ricktikra
Author

Hi Brandon, I was wondering:
How would
rm(x)
vs
x <- NULL
compare in terms of computing speed and memory efficiency?
Thanks for your work!
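
A small sketch of the behavioural difference, with made-up object names `big_a` and `big_b`: assigning `NULL` keeps the name bound to an empty value, while `rm()` removes the name altogether. In both cases the original vector becomes unreachable and can be reclaimed by the garbage collector, so I would not expect a large memory difference, but that is an assumption rather than a measurement.

```r
big_a <- numeric(1e6)
big_b <- numeric(1e6)

big_a <- NULL   # the name still exists, now bound to NULL
rm(big_b)       # the name itself is removed

a_exists <- exists("big_a")   # TRUE
b_exists <- exists("big_b")   # FALSE
a_is_null <- is.null(big_a)   # TRUE

invisible(gc())  # optional: ask R to collect the unreachable vectors now
```

Inside a list the two are closer than they look, because `lst$x <- NULL` deletes the element `x` from the list.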

sketchgamma
Author

I thought you're not supposed to use for loops in R. lapply and similar functions do the same things much faster, and most of the time when people use for, they do the same thing many times independently.
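
For comparison, here is the same independent computation written three ways. The names `xs`, `squares_for`, `squares_vapply`, and `squares_vec` are invented for this sketch. As far as I know, a preallocated for loop and the apply family are often close in speed, and a fully vectorized expression is usually the fastest of the three, so it is worth timing on your own data.

```r
xs <- 1:10

# 1. Preallocated for loop
squares_for <- numeric(length(xs))
for (i in seq_along(xs)) {
  squares_for[i] <- xs[i]^2
}

# 2. vapply: like lapply, but returns a vector with a checked element type
squares_vapply <- vapply(xs, function(x) x^2, numeric(1))

# 3. Vectorized arithmetic, no explicit loop in R code
squares_vec <- xs^2

print(squares_vec)
```

All three give the same result. The choice is mostly about readability unless profiling shows that the loop is the bottleneck.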

kokainum