How to read and write Parquet file data in Apache Spark | Parquet | Apache Spark

#Apache #Spark #CCA175 #Parquet
In this video we will learn how to work with the Parquet file format in Apache Spark; a code sketch of the steps follows the timestamps below.

⏰TIMESTAMPS
00:00 Objectives
00:25 What is the Parquet file format
01:13 How to read a Parquet file as a DataFrame in Apache Spark
01:55 How to apply a filter function on a Spark DataFrame
02:50 How to select a few columns from the DataFrame
03:33 How to save a DataFrame to HDFS in Parquet file format
06:22 How to save a DataFrame to HDFS in Parquet file format with gzip compression
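
For reference, a minimal Scala sketch of the steps in these timestamps, assuming spark-shell (where the spark session is predefined); the dataset path and column names (orders, order_status, and so on) are hypothetical stand-ins, not the exact ones used in the video:

```scala
import org.apache.spark.sql.functions.col

// 01:13 Read a Parquet file into a DataFrame; the schema is taken
// from the Parquet file footer, so no schema definition is needed.
val ordersDF = spark.read.parquet("hdfs:///data/orders")

// 01:55 Apply a filter on the DataFrame.
val closedOrders = ordersDF.filter(col("order_status") === "CLOSED")

// 02:50 Select a few columns.
val slim = closedOrders.select("order_id", "order_date", "order_status")

// 03:33 Save the DataFrame to HDFS in Parquet format
// (Spark uses snappy compression by default).
slim.write.parquet("hdfs:///output/closed_orders")

// 06:22 Save with gzip compression instead.
slim.write.option("compression", "gzip").parquet("hdfs:///output/closed_orders_gzip")
```

Spark writes Parquet with snappy compression by default; setting the compression option on a single write overrides that default just for that output.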

✔️ DOWNLOAD PRACTICE DATASET

🔵 COMPLETE APACHE SPARK TUTORIAL PLAYLIST 🔵

🔵 WORKING WITH STRUCTURED DATA IN APACHE SPARK 🔵

🔵 WORKING WITH DATE COLUMNS IN APACHE SPARK 🔵

🔵 WORKING WITH WINDOWING, AGGREGATE FUNCTIONS IN APACHE SPARK 🔵
Comments

Very nicely explained... thanks for the content!

nilgiripaiya

Thank you so much! Very good explanation!

venkatk

Hi, I have searched online a lot, but for partitionBy I keep seeing the same country or gender examples. Can a column named Brand, with values like H & M, Zara, and Tommy Hilfiger, be used as the partition column in Scala? The other columns are description, price, and index, all in CSV format. Please suggest if I have to do something additional to the CSV file before converting it into a Parquet file. The error I am getting is: Caused by: Task failed while writing rows. Caused by: java.io.IOException: Mkdirs failed to create file:/path

vilw
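
A minimal sketch of one way to approach this, assuming spark-shell (where the spark session is predefined); the file paths and column names below are illustrative, not from the question's actual data. Spark percent-encodes characters such as spaces and '&' in partition directory names, so partitioning on free-text brand values generally works; normalizing the column is optional:

```scala
import org.apache.spark.sql.functions.{col, regexp_replace, trim}

// Illustrative input: a CSV with brand, description, price, index columns.
val df = spark.read
  .option("header", "true")
  .option("inferSchema", "true")
  .csv("hdfs:///data/brands.csv")

// Optional: normalize brand values so partition directories get plain
// names (brand=H_M instead of the percent-encoded brand=H%20%26%20M).
val cleaned = df.withColumn("brand",
  regexp_replace(trim(col("brand")), "[^A-Za-z0-9]+", "_"))

// Write Parquet partitioned by brand.
cleaned.write.partitionBy("brand").parquet("hdfs:///output/brands_parquet")
```

The "Mkdirs failed to create file:/..." error usually points at the output path rather than the data: Spark is resolving the path to the local filesystem (file:/) and cannot create the directory there, often a permissions issue or a missing hdfs:// scheme, so it is worth checking the output URI first.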

Why are there 2 partitions? What is the default number of partitions when we write a file?

pratapranvir
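
A likely explanation, with a small sketch to verify it (paths are illustrative, and spark-shell's predefined spark session is assumed): when Spark reads a file it creates one partition per input split, capped at spark.sql.files.maxPartitionBytes (128 MB by default), and the Parquet writer emits one part file per partition, so a file slightly larger than one split is written back as two files.

```scala
// Inspect how many partitions the DataFrame has; the writer produces
// roughly one part-XXXXX file per partition.
val df = spark.read.parquet("hdfs:///data/orders")
println(df.rdd.getNumPartitions)

// Force a single output file; a single task then writes all the data,
// so reserve this for reasonably small DataFrames.
df.coalesce(1).write.parquet("hdfs:///output/orders_single")
```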

Do you have any idea how to do the same thing in Java?

vamshikrishna