CCA 175 Real Time Exam Scenario 19 | Read CSV | AGGREGATE | RANK | Save as TEXT Pipe Delimited

Data Description
All the product records are stored at /user/spark/dataset/retail_db/products
All the category records are stored at /user/spark/dataset/retail_db/categories
All the order item records are stored at /user/spark/dataset/retail_db/order_items
Data is in text format

Output Requirement
Get the top five best-selling products in the "Accessories" category
Place the result data in HDFS directory /user/spark/dataset/result/scenario19/solution
Save the output as text file
Use "|" as field separator
Output data should contain columns category_name,product_name,product_revenue
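
The requirements above can be sketched in PySpark roughly as follows. This is a minimal sketch, not the video's actual solution. It assumes the files are comma-delimited with no header row and follow the usual retail_db column order; only the columns used in the query are named in the comments below, and the rest of the layout is an assumption. `solve`, `COLUMNS`, and `TOP5_QUERY` are hypothetical names, and `spark` is an existing SparkSession such as the one the pyspark shell provides.

```python
# Sketch only: the full column layouts are assumed (typical retail_db order),
# not confirmed by the description above.
BASE = "/user/spark/dataset/retail_db"
OUTPUT = "/user/spark/dataset/result/scenario19/solution"

COLUMNS = {
    "products": ["product_id", "product_category_id", "product_name",
                 "product_description", "product_price", "product_image"],
    "categories": ["category_id", "category_department_id", "category_name"],
    "order_items": ["order_item_id", "order_item_order_id",
                    "order_item_product_id", "order_item_quantity",
                    "order_item_subtotal", "order_item_product_price"],
}

# Aggregate revenue per product first, then rank within the category,
# then keep ranks 1 to 5.
TOP5_QUERY = """
SELECT category_name, product_name, product_revenue
FROM (
    SELECT category_name, product_name, product_revenue,
           RANK() OVER (PARTITION BY category_name
                        ORDER BY product_revenue DESC) AS rnk
    FROM (
        SELECT c.category_name, p.product_id, p.product_name,
               SUM(oi.order_item_subtotal) AS product_revenue
        FROM products p
        JOIN order_items oi ON p.product_id = oi.order_item_product_id
        JOIN categories c ON p.product_category_id = c.category_id
        WHERE c.category_name = 'Accessories'
        GROUP BY c.category_name, p.product_id, p.product_name
    ) revenue
) ranked
WHERE rnk <= 5
ORDER BY product_revenue DESC
"""


def solve(spark, base=BASE, output=OUTPUT):
    """Run the scenario with an existing SparkSession and return the result."""
    for table, names in COLUMNS.items():
        # Assumed: comma-delimited text, no header, columns in the order above.
        df = spark.read.csv(base + "/" + table, inferSchema=True).toDF(*names)
        df.createOrReplaceTempView(table)
    result = spark.sql(TOP5_QUERY)
    # The CSV writer with sep="|" emits pipe-delimited text part files.
    result.write.csv(output, sep="|", mode="overwrite")
    return result
```

In the pyspark shell this could be run as `solve(spark).show(truncate=False)`. The CSV writer is used here because it accepts a custom separator; it may quote a field that itself contains the separator, so concatenating the columns into one string and writing with `write.text` is an alternative if quoting is a concern.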

Download the sample data from our GitHub repository.



Comments

Best resource to clear the CCA 175 exam in the first attempt. Thanks Proedu.

dasariprasad

It definitely boosts confidence. It's very close to the real exam, and this is enough to clear it.

anilmnt

Hi,

Is it necessary to use the rank() function?

I get the same result using GROUP BY, ORDER BY, and LIMIT 5.

select c.category_name, p.product_name, sum(oi.order_item_subtotal) as product_revenue
from p join oi on p.product_id = oi.order_item_product_id join c on p.product_category_id = c.category_id
where c.category_name = 'Accessories'
group by c.category_name, p.product_id, p.product_name
order by product_revenue desc limit 5

rajantawade
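
On the question above: the two approaches return the same rows as long as no products tie at the cutoff. With RANK(), tied rows share a rank, so a filter such as rank <= 5 can keep more than five rows, whereas LIMIT 5 always keeps exactly five and breaks the tie arbitrarily. A small plain-Python illustration of that difference, using made-up revenue figures and a hypothetical helper `rank_desc` (this is not Spark code):

```python
def rank_desc(values):
    """Mimic SQL RANK() over values sorted descending: ties share a rank,
    and the rank after a tie skips ahead."""
    ordered = sorted(values, reverse=True)
    ranks = []
    for i, value in enumerate(ordered):
        if i > 0 and value == ordered[i - 1]:
            ranks.append(ranks[-1])   # same value as previous row: same rank
        else:
            ranks.append(i + 1)       # otherwise rank is the 1-based position
    return list(zip(ordered, ranks))


# Hypothetical revenues with a tie for fifth place.
revenues = [900.0, 800.0, 700.0, 600.0, 500.0, 500.0, 400.0]

ranked = rank_desc(revenues)
by_rank = [value for value, rank in ranked if rank <= 5]   # like WHERE rank <= 5
by_limit = sorted(revenues, reverse=True)[:5]              # like ORDER BY ... LIMIT 5

print(len(by_rank))   # 6
print(len(by_limit))  # 5
```

Which behavior is wanted depends on how "top five" is read; when there is no tie at fifth place, both give the same five products.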

Has anyone here taken the test recently? Will anyone monitor the exam while we are taking it? Do we need to share our screen?

rajareddy
