CCA 175 Real Time Exam Scenario 4 | Read CSV file | Write as TSV in HDFS with LZ4 Compression

Data Description
1. All category records are stored at
/user/spark/dataset/retail_db/categories
2. Data is in text format
3. Data is comma separated

Output Requirements
1. Convert the data into a tab-delimited file
2. Use text format for the output files
3. Place the result data in the HDFS directory /user/spark/dataset/result/scenario4/solution
4. Compress the output using LZ4 compression
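
The requirements above can be sketched in PySpark roughly as follows. This is a minimal sketch, not necessarily the solution shown in the video: the helper names to_tsv and run_job are hypothetical, the fields are assumed to contain no embedded commas or quotes, and the codec class name is the standard Hadoop LZ4 codec.

```python
# Input and output locations taken from the scenario description.
INPUT_DIR = "/user/spark/dataset/retail_db/categories"
OUTPUT_DIR = "/user/spark/dataset/result/scenario4/solution"

# Hadoop's LZ4 codec class, passed to saveAsTextFile for compression.
LZ4_CODEC = "org.apache.hadoop.io.compress.Lz4Codec"


def to_tsv(line):
    """Convert one comma-separated record into a tab-separated record.

    Assumption: no field contains an embedded comma or quote.
    """
    return "\t".join(line.split(","))


def run_job():
    """Read the CSV text files, convert each line, write LZ4-compressed text."""
    # Imported here so that to_tsv can be tried out without Spark installed.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("scenario4").getOrCreate()
    lines = spark.sparkContext.textFile(INPUT_DIR)
    # saveAsTextFile fails if OUTPUT_DIR already exists.
    lines.map(to_tsv).saveAsTextFile(OUTPUT_DIR, compressionCodecClass=LZ4_CODEC)
    spark.stop()
```

Calling run_job() from a script submitted with spark-submit should produce part files ending in .lz4 under the output directory. The codec may depend on Hadoop's native libraries being available on the cluster, so a "native lz4 library not available" error usually points at the environment rather than at the code.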

Download the sample data from our GitHub repository.

Comments

Is it possible to add a header when writing to a simple text file?

ivanduque

I am getting this error: "Caused by: java.lang.RuntimeException: native lz4 library not available".

mattanishanth