Class 59 - Different file formats - json, orc, parquet and avro

As part of this topic, let us see the different file formats supported by Spark. File formats include csv, orc, parquet, avro etc.

* Different file formats and APIs associated with them
* Spark 2.x supports these file formats out of the box - json, parquet, orc, csv, text etc
* With JSON we will also see how we can process data with complex JSON objects such as
  * JSON objects spanning multiple lines
  * JSON objects which have nested JSON objects
  * and more
* We can also use 3rd party APIs to read data from file formats such as Avro
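
To make the two "complex JSON" cases above concrete, here is a minimal plain-Python sketch (standard library only, not Spark code, and not taken from the class itself). The sample records and the helper names `parse_per_line` and `flatten` are invented for illustration. It shows why a reader that expects one JSON object per line fails on an object that spans multiple lines, and how nested objects can be flattened into dotted column names.

```python
import json

# Hypothetical sample data, invented for illustration only.
# 1) Line-delimited JSON: each line is one complete object.
JSON_LINES = '{"id": 1, "city": "Austin"}\n{"id": 2, "city": "Dallas"}\n'

# 2) One JSON document whose objects span several lines and nest other objects.
MULTI_LINE = """[
  {
    "id": 1,
    "customer": {"name": "Ann", "address": {"city": "Austin"}}
  },
  {
    "id": 2,
    "customer": {"name": "Bob", "address": {"city": "Dallas"}}
  }
]"""


def parse_per_line(text):
    """Parse text assuming one complete JSON object per line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def flatten(record, prefix=""):
    """Turn nested objects into a single level with dotted key names."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


# Per-line parsing works for line-delimited input ...
records = parse_per_line(JSON_LINES)

# ... but fails on the multi-line document, because "[" alone is not valid JSON.
try:
    parse_per_line(MULTI_LINE)
    per_line_ok = True
except json.JSONDecodeError:
    per_line_ok = False

# The multi-line document has to be parsed as a whole, then flattened.
flat_records = [flatten(r) for r in json.loads(MULTI_LINE)]
print(per_line_ok)
print(flat_records[0])
```

Spark's JSON reader makes the same per-line assumption by default. In Spark 2.2 and later, the `multiLine` read option tells it to parse each file as a whole document instead.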

Comments

Hi Durga.
I wanted to know exactly what RDD, DataFrame and Dataset each do and how they differ. Could you please explain, or give me a URL where I can read about it in detail?
Thank you

subhrasucharita