Running R code in parallel using parallel::clusterApply()

R code is often quick to write, but not always quick enough to run. One strategy to speed up runtimes is to parallelize the code. Here, we create 200 regression models using 200 different predictors: a task well suited to parallelization.

First, we set up the workers using makeCluster(). Next, we create a function that takes a predictor as input and returns a model summary. Then we can create all 200 models with a simple one-liner using lapply(). To parallelize, we have to overcome a small challenge, namely providing the workers with the data using clusterExport(). Then we can simply exchange lapply() for clusterApply() to run our code in parallel.
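The steps above can be sketched as follows. This is a minimal illustration, not the code from the video: the data frame df, the predictor count, and the helper fit_model are simulated/illustrative assumptions.

```r
library(parallel)

# Simulated data: an outcome y and 200 predictor columns x1..x200 (assumption)
set.seed(42)
n_pred <- 200
df <- as.data.frame(matrix(rnorm(100 * n_pred), nrow = 100))
names(df) <- paste0("x", seq_len(n_pred))
df$y <- rnorm(100)

# Function: takes a predictor name, returns a model summary
fit_model <- function(predictor) {
  summary(lm(reformulate(predictor, response = "y"), data = df))
}

# Sequential version: a simple one-liner with lapply()
models_seq <- lapply(names(df)[1:n_pred], fit_model)

# Parallel version: set up workers, ship them the data, swap in clusterApply()
cl <- makeCluster(2)                  # worker count: adjust to your machine
clusterExport(cl, varlist = "df")     # workers start with empty workspaces
models_par <- clusterApply(cl, names(df)[1:n_pred], fit_model)
stopCluster(cl)
```

Note that clusterExport() is needed because each PSOCK worker is a fresh R session: without it, the workers would not find df and every model fit would fail.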

The bench::mark() function shows the speed improvement this gives us.
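A self-contained sketch of such a benchmark is below; again, df and fit_model are illustrative stand-ins, and the bench package must be installed.

```r
library(parallel)

# Small simulated data set (assumption, not the video's data)
set.seed(1)
df <- as.data.frame(matrix(rnorm(100 * 50), nrow = 100))
names(df) <- paste0("x", 1:50)
df$y <- rnorm(100)

fit_model <- function(predictor) {
  summary(lm(reformulate(predictor, response = "y"), data = df))
}

cl <- makeCluster(2)
clusterExport(cl, "df")

timings <- bench::mark(
  sequential = lapply(names(df)[1:50], fit_model),
  parallel   = clusterApply(cl, names(df)[1:50], fit_model),
  check = FALSE  # fitted summaries carry environments; compare timings only
)
stopCluster(cl)
print(timings[, c("expression", "median")])
```

On a toy example this small, the parallel version can even be slower than lapply() because of communication overhead; the speed-up appears once each model fit does enough work to outweigh that overhead.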

Code can be found here:

All the best for speeding up your R code!

Thumbnail image: Chait Goli from Pexels

Contact me, e.g. to discuss (online) R workshops / trainings / webinars:

Playlist: Music chart history
Comments

I've used parallel::detectCores() a lot, taught it in workshops, and also used it in this video. However, a number of serious problems may arise from using this function. A better alternative is parallelly::availableCores(). Thanks to Henrik Bengtsson!

See this newer video:
Why You Should NOT use parallel::detectCores() in R
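A minimal sketch of the swap, assuming the parallelly package is installed: availableCores() respects container/cgroups limits, HPC scheduler allocations, and R options, whereas detectCores() just reports the hardware count.

```r
# Preferred: honors limits imposed by containers, schedulers, and options
n_workers <- parallelly::availableCores()

cl <- parallel::makeCluster(n_workers)
parallel::stopCluster(cl)
```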

StatistikinDD

Great video, glad I found your channel

MsBainy

"If you have ssh installed, you can specify a list of machines for the first argument:

cl <- makeCluster(c("n1", "n2", "n3", "n4"))".

How do I get the names of the machines to build that list?

lucianomaldonado