CCA 175 Real Time Exam Scenario 8 | Read CSV File | Write in HIVE Table with PARQUET File Format

Data Description
1. All category records are stored at /user/spark/dataset/retail_db/categories
2. The data is in comma-separated text format

Output Requirement
1. Create a metastore table named 'categories_parquet'
2. The table should contain only category_id and category_name
3. Save all categories in the metastore table categories_parquet
4. Use Parquet format for the output files
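
One possible way to meet the requirements above is sketched below in PySpark. It is a minimal sketch under assumptions, not the solution shown in the video: the three-column layout (category_id, category_department_id, category_name) and the absence of a header row are assumed from the usual retail_db sample data, and the function name is hypothetical. It also assumes a Spark 2.3+ session created with Hive support enabled.

```python
# Sketch: read the comma-separated categories data and save it as a
# Parquet-backed metastore table.

INPUT_PATH = "/user/spark/dataset/retail_db/categories"
TABLE_NAME = "categories_parquet"

# Assumption: the file has no header and holds these three columns.
SCHEMA_DDL = "category_id INT, category_department_id INT, category_name STRING"

# Only these columns are required in the output table.
OUTPUT_COLUMNS = ["category_id", "category_name"]


def save_categories_as_parquet_table(spark):
    """Read the CSV data, keep the two required columns, write a Parquet table.

    `spark` is expected to be a SparkSession built with enableHiveSupport().
    """
    categories = spark.read.csv(INPUT_PATH, schema=SCHEMA_DDL, sep=",")
    (
        categories.select(*OUTPUT_COLUMNS)
        .write.format("parquet")   # output files in Parquet format
        .mode("overwrite")         # replace the table if it already exists
        .saveAsTable(TABLE_NAME)   # register the table in the metastore
    )
```

In spark-shell or pyspark the session is normally available as spark, so calling save_categories_as_parquet_table(spark) and then running spark.sql("describe formatted categories_parquet") is one way to check the table's columns and storage format.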

Download the sample data from our GitHub repository.

Clear CCA 175 on the first attempt using our dumps: Apache Spark and Hadoop Developer Certification

Complete CCA 175 Tutorial: Apache Spark and Hadoop Developer Certification

Proedu
cca 175
parquet file

Related recommendations

Comments

Hello Proedu, is it mandatory to create the table in Spark SQL?
Can I use only the DataFrame API, like ....write.format("hive").option("fileFormat", ? Thanks for your feedback.

lfriboulet

CCA 175 Real Time Exam Scenario 12 | Read PARQUET Data | Save as JSON with Snappy Compression

CCA 175 Real Time Exam Scenario 5 | Read AVRO data | Write PARQUET in HDFS with SNAPPY Compression

CCA 175 Real Time Exam Scenario 7 | Read CSV File | Write in HIVE Table

CCA 175 Real Time Exam Scenario 13 | Read Hive Table | Write as PARQUET with SNAPPY Compression

CCA 175 Real Time Exam Scenario 11 | Read AVRO Data | Write as Tab Separated Value bzip2 compression

CCA 175 Real Time Exam Scenario 10 | Read CSV File | Write in HIVE Table

CCA 175 Real Time Exam Scenario 17 | JOIN Multiple DataFrames | Save as JSON and DEFLATE Compression

CCA 175 Real Time Exam Scenario 1 | Read Tab Delimited File | Write as CSV in HDFS

CCA 175 Real Time Exam Scenario 15 | Read CSV Data | JOIN Multiple DataFrames | Save as CSV

CCA 175 Real Time Exam Scenario 18 | JOIN Multiple DataFrames, AGGREGATE and SORT data| Save as ORC

CCA 175 Real Time Exam Scenario 6 | Read Hive table | Write as PARQUET in HDFS with GZip Compression

CCA 175 Real Time Exam Scenario 3 | Read Tab Delimited File | Write as ORC with SNAPPY Compression

CCA 175 Real Time Exam Scenario 16 | Read CSV | Save as PARQUET with SNAPPY Compression

CCA 175 Real Time Exam Scenario 20 | JOIN Multiple DataFrames | Save as PARQUET | SNAPPY Compression

CCA 175 Real Time Exam Scenario 9 | Read AVRO Data | Write as JSON in HDFS

CCA 175 Real Time Exam Scenario 14 | Read Tab Separated Values | Save PARQUET with GZIP compression

CCA 175 Real Time Exam Scenario 4 | Read CSV file | Write as TSV in HDFS with LZ4 Compression

CCA 175 Real Time Exam Scenario 2 | Read Parquet File | Write as JSON in HDFS with GZIP Compression

CCA 175 Video

CCA 175 Real Time Exam Scenario 19 | Read CSV | AGGREGATE | RANK | Save as TEXT Pipe Delimited

CCA 175 - Hadoop & Spark Developer Certification | Cloudera CCA 175 Exam | Intellipaat

CCA 175 Certification Preparation Strategy

CCA 175 - Exam Taking Tips - Determining Spark Cluster and Memory Configuration