A Tale of Three Apache Spark APIs: RDDs, DataFrames, and Datasets - Jules Damji


Session hashtag: #EUdev12

About: Databricks provides a unified data analytics platform, powered by Apache Spark™, that accelerates innovation by unifying data science, engineering and business.

Comments

Amazing talk. I left Spark to move into ML when there were only RDDs; I came back, saw DataFrames in Spark, and was totally confused. Your video helped a lot. Thank you!

aliwaheed

What an amazing talk! Crisp and clear! Truly impressed.

tdkboxster

Now I know about RDDs, DataFrames, and Datasets. Thanks for explaining them so precisely. Appreciated.

techoral

Thanks for the in-depth explanation of RDDs, DataFrames, and Datasets...

rahultiwari

This is a brilliant and fluid explanation.

ctriz

Amazing presentation. Very intuitive. Thanks, boss!

puja

Amazing talk! Very well explained indeed.

AllForLove

It was very insightful; such talks really help developers understand why and how one should use the structured APIs.

shemantkr

Thanks for the video. Very understandable!

TheTambourinist

Only 300 likes for such an informative, crystal-clear talk??

anibaldk

I had a nice learning time, thanks for the talk!

sayandbhattacharya

I am wondering how the "type safe" feature combines with the "unstructured data" that is the nature of the systems Spark is typically used in.

nareshgb

I was trying out the example you mentioned @10:46 and, as I was getting a compile-time error, I had to rewrite the final statement as below.


parsedRdd.filter(content => content._2 == "en").map(filteredContent => printf(s"${filteredContent._1}: ${filteredContent._2}"))


I would really appreciate it if you could review the above statement.

varundosapati
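The tuple-based filter/map pattern discussed in the comment above can be sketched with plain Scala collections, which behave the same way for this transformation and don't require a running Spark cluster. The data below is hypothetical (the talk's actual example at 10:46 parses Wikipedia page-view records), and the tuple shape (title, language, views) is an assumption for illustration.

```scala
// Plain Scala Seq standing in for the RDD; each tuple is assumed to be
// (pageTitle, language, views) — hypothetical sample data.
val parsed = Seq(
  ("Apache_Spark", "en", 1000),
  ("Apache_Spark", "fr", 50),
  ("Scala_(programming_language)", "en", 300)
)

// Keep only the English rows, then format each as "title: views",
// mirroring the filter(...).map(...) chain on an RDD.
val formatted = parsed
  .filter { case (_, lang, _) => lang == "en" }
  .map { case (title, _, views) => s"$title: $views" }

formatted.foreach(println)
```

Pattern-matching on the tuple (`case (_, lang, _) => ...`) is usually more readable than positional accessors like `content._2`, since the field names document what each position means.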

Thanks!
Can you attach the links here?

meravchkroun

This was amazing! Pretty well explained!
Thanks!

Chris_zacas