CCA 175 Real Time Exam Scenario 12 | Read PARQUET Data | Save as JSON with Snappy Compression


Data Description
All the order records are stored at
/user/spark/dataset/retail_db/orders_parquet
The data is in Parquet format

Output Requirement
✔️ Output all the PENDING orders in July 2013
✔️ Use JSON format for the output files
✔️ Place the result data in HDFS directory /user/spark/dataset/result/scenario11/solution
✔️ Result should only contain records that have order_status value as "PENDING"
✔️ order_date should be in format yyyy-MM-dd
✔️ Compress the output using snappy compression; the output should contain only the order_date and order_status columns

Download the sample data from our GitHub repository.



Comments

Error while writing dataframe using snappy

abhigyapranshu
