CCA 175 Real Time Exam Scenario 8 | Read CSV File | Write in HIVE Table with PARQUET File Format

Data Description
1. All the category records are stored at /user/spark/dataset/retail_db/categories
2. The data is in comma-separated text format.

Output Requirement
1. Create a metastore table named 'categories_parquet'.
2. The table should contain only category_id and category_name.
3. Save all categories in the metastore table categories_parquet.
4. Use Parquet format for the output files.
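
The requirements above can be sketched with the DataFrame API in PySpark. This is only one possible approach, not necessarily the one shown in the video. The helper name save_categories_as_parquet, the column order (category_id, category_department_id, category_name) and the "overwrite" save mode are assumptions that the description does not state; adjust them to the actual data.

```python
def save_categories_as_parquet(spark, input_path, table_name="categories_parquet"):
    """Read comma-separated category records and save two columns as a Parquet table.

    `spark` is expected to be a SparkSession created with Hive support enabled.
    """
    # Assumed layout of each input line (not stated in the description):
    # category_id,category_department_id,category_name
    schema = "category_id INT, category_department_id INT, category_name STRING"

    # The default CSV separator is a comma, so no extra option is needed.
    categories = spark.read.csv(input_path, schema=schema)

    # Keep only the two columns the output table should contain.
    selected = categories.select("category_id", "category_name")

    # saveAsTable registers the table in the metastore; format("parquet")
    # sets the file format. "overwrite" is an assumption so reruns do not fail.
    selected.write.format("parquet").mode("overwrite").saveAsTable(table_name)
    return selected
```

To use it, build a session with SparkSession.builder.enableHiveSupport().getOrCreate(), call save_categories_as_parquet(spark, "/user/spark/dataset/retail_db/categories"), and check the result with spark.sql("SELECT * FROM categories_parquet LIMIT 5").show().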

Download the sample data from our Github repository.

Clear CCA 175 on the first attempt using our dumps: Apache Spark and Hadoop Developer Certification

Complete CCA 175 Tutorial: Apache Spark and Hadoop Developer Certification
Comments

Hello Proedu, is it mandatory to create the table in Spark SQL?
Can I use only the DataFrame API, like ....write.format("hive").option("fileFormat", ? Thanks for your feedback.

lfriboulet