CCA 175 Real Time Exam Scenario 1 | Read Tab Delimited File | Write as CSV in HDFS

Data Description
1. All customer records are stored in the HDFS directory
/user/spark/dataset/retail_db/customers-tab-delimited
2. The data is in text format
3. The data is tab-delimited

Output Requirement
1. Output all the customers who live in California
2. Use text format for the output files
3. Place the result data in /user/spark/dataset/result/scenario1/solution
4. The result should contain only records whose state value is "CA"
5. The output should contain only each customer's full name
Example: Robert Hudson
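
The requirements above can be sketched in Python for PySpark as follows. This is a minimal sketch, not the solution shown in the video. It assumes the usual retail_db customers column order (id, first name, last name, email, password, street, city, state, zipcode), so the indices 1, 2 and 7 are assumptions to verify against the sample data. The function names full_name_if_ca and write_ca_customer_names are hypothetical.

```python
from typing import Optional

# Assumed column positions in the tab-delimited customers file
# (hypothetical; check them against the actual sample data).
FNAME_IDX, LNAME_IDX, STATE_IDX = 1, 2, 7

SRC = "/user/spark/dataset/retail_db/customers-tab-delimited"
DST = "/user/spark/dataset/result/scenario1/solution"


def full_name_if_ca(line: str) -> Optional[str]:
    """Return 'first last' for a California customer, otherwise None."""
    fields = line.split("\t")
    if len(fields) <= STATE_IDX:
        return None  # malformed or short record: skip it
    if fields[STATE_IDX] != "CA":
        return None  # customer lives in another state
    return f"{fields[FNAME_IDX]} {fields[LNAME_IDX]}"


def write_ca_customer_names(spark, src: str = SRC, dst: str = DST) -> None:
    """Read the tab-delimited text, keep CA customers, write names as text."""
    lines = spark.sparkContext.textFile(src)
    names = lines.map(full_name_if_ca).filter(lambda name: name is not None)
    # saveAsTextFile fails if the target directory already exists.
    names.saveAsTextFile(dst)
```

In the pyspark shell, where a SparkSession named spark is already defined, the job would be started with write_ca_customer_names(spark). Keeping the per-line logic in a plain function makes it easy to check on a single record before running it on the cluster.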

Download the sample data from our GitHub repository.

Comments

Hi Proedu, is providing the alias name necessary for this question?

arindamnath

Thanks. Are the questions in the real exam similar or more difficult?

lfriboulet
