14 Read, Parse or Flatten JSON data

This video explains: How to read JSON files? How to parse JSON data? How to flatten JSON data? What is the explode function? What is the from_json function? What is the to_json function? How to write a complex schema for JSON?

Chapters
00:00 - Introduction
02:01 - Read Single Line JSON file
03:29 - Read Multiline JSON file
04:42 - Read JSON data in Single column
05:29 - Read JSON file with Schema
07:00 - Write Schema ddl String
09:20 - from_json function
11:00 - to_json function
12:39 - Flatten JSON data

The series provides a step-by-step guide to learning PySpark, the Python API for Apache Spark, a popular open-source distributed computing framework used for big data processing.

New video every 3 days ❤️

#spark #pyspark #python #dataengineering
Comments
Author

Buddy, you rock! How come that we have only 2K views? Best tutorial on the tube!

adulterrier
Author

Thanks for this topic: Flatten JSON data

quazimoinuddin
Author

Thanks, buddy. explode_outer is also an important concept.

rakeshpanigrahi
Author

It's really great. I do have a question: what if the JSON is a struct nested 5 to 6 levels deep? How can we convert it to a tabular format?

souravnandy
Author

Thanks very much for the tutorial :) I have a query regarding reading JSON files.

I have an array of structs where each struct has a different structure/schema.
Based on a certain property value of the struct, I apply a filter to get that nested struct. However, when I display it using printSchema, it contains fields that do not belong to that object but are somehow being associated with it from the schemas of the other structs. How can I fix this issue?

shreyaspatil