Spark Creating Dataframes | Spark-SQL | Session-11.2

Hi Friends
With this session we start the Spark-SQL module, and with it comes a very important topic: the difference between Dataset and DataFrame, which causes quite a bit of confusion. In this video I have explained the difference and its importance in a concise way. I hope your DataFrame vs Dataset confusion will be cleared up after watching this video.
-Arpit
GKCodelabs
Comments

One request from my end: please make a video on DB connections (MongoDB and accessing HBase/Hive tables) in Spark as well. I would highly appreciate that. TYVM

kapilchhatwani

Hi Arpit! You are tutoring nicely. I have a request: please make a video on how to handle JSON data. By handling I mean how to stringify, normalise, and then transform it so that we can use it with ease. Please also let us know the possible ways to handle such JSON, as well as how to deal with corrupted JSON data (say, extra spaces in a JSON object or some special characters in it). Hoping for a reply :). Thank you

vikky

Hi Arpit, please advise in this regard. I have 100 CSV files. I run a query using Spark SQL over each CSV file to get the count of a column in a DataFrame. I want the sum of this count value (the DataFrame value) across all 100 CSV files. Thanks.

parshuramk