filmov
tv
Speeding up computations in R with parallel programming in the cloud

Показать описание
There are many common workloads in R that are "embarrassingly parallel": group-by analyses, simulations, and grid-based computations are just a few examples. In this talk, I'll provide a review of tools for implementing embarrassingly parallel computations in R, including the built-in "parallel" package and extensions such as the "foreach" package. I'll also demonstrate how you can dramatically reduce the time for a complex computation -- optimizing hyperparameters for a predictive model with the "caret" package -- by using a cluster of parallel R session in the cloud. With the "doAzureParallel" package, I'll show how you can create a cluster of virtual machines running R in Azure, parallelize the problem by registering backend to "foreach", and shut down the cluster when the computation is complete, all with just a few lines of R code.