Enabling Exploratory Analysis of Large Data with R and Spark

From Seattle Spark Meetup 2/10/2016 @ FHCRC

In this video, Hossein will introduce SparkR and how it integrates the two worlds of Spark and R. He will demonstrate one of the most important use cases of SparkR: exploratory analysis of very large data. Specifically, he will show how Spark’s features and capabilities, such as caching distributed data and integrated SQL execution, complement R’s great tools such as visualization and diverse packages in a real world data analysis project with big data.
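The workflow described above can be sketched in a few lines of Spark 1.6-era SparkR. This is a hypothetical example, not code from the talk: the file path, dataset, and column names (`carrier`, `dep_delay`) are placeholders, and running it requires a Spark installation.

```r
# Sketch of the SparkR exploratory-analysis workflow (Spark 1.x API).
# Assumes SparkR is on the library path and a local Spark install is available.
library(SparkR)
library(ggplot2)

sc <- sparkR.init(master = "local[*]")
sqlContext <- sparkRSQL.init(sc)

# Read a large dataset into a distributed DataFrame and cache it in memory,
# so repeated exploratory queries avoid re-reading from disk.
flights <- read.df(sqlContext, "hdfs:///data/flights.json", source = "json")
cache(flights)

# Use Spark's integrated SQL execution on the distributed data.
registerTempTable(flights, "flights")
delays <- sql(sqlContext, "SELECT carrier, avg(dep_delay) AS avg_delay
                           FROM flights GROUP BY carrier")

# Collect only the small aggregated result locally, then use R's
# visualization tools (here ggplot2) on the local data frame.
local_delays <- collect(delays)
ggplot(local_delays, aes(carrier, avg_delay)) + geom_bar(stat = "identity")
```

The key pattern is that the heavy lifting (caching, SQL aggregation) happens in Spark across the cluster, and only the small summarized result is collected into R for plotting.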
Comments

This is fascinating! I always thought I needed my data to already be in RDD form to use SparkR, but now I see that isn't necessary. Thanks a ton.

nimamaleki

Great talk, thanks so much for giving it and uploading it!

rescuemay

Thank you for sharing the video. Very helpful.

Annemariew

Hi, is there a function to read NetCDF files in SparkR or sparklyr, like nc_open() in R?

rajanikumar

How do I get the same frontend that you are using?

sahilsareen

Hi, I am having a problem reading Hive tables from SparkR. I did the following but get a "Table Not Found" error. Could anyone help, please?

sc <- sparkR.init()
sqlContext <- sparkRSQL.init(sc)
CLF <- sql(sqlContext, "SELECT * FROM LIMIT 5")

moloyde

I have sparkR and R installed on my Ubuntu machine. When I try to launch sparkR through the terminal, I get the following error:
Error in eval(expr, envir, enclos) :
could not find function ".getNamespace"
Error: unable to load R code in package ‘SparkR’
During startup - Warning message:
package ‘SparkR’ in options("defaultPackages") was not found
Could you please help me fix this?

pratikshirbhate

Is SparkR a separate piece of software, or is it just a part of Spark?

soundsandambientvideos