Dealing Multi source CSV file in Spark SQL | json_tuple | CSV & JSON in a single file | using Scala

Hi Friends,
In today's video, I explain how to flatten a CSV file using Scala when the file contains both comma-delimited values and JSON in the same records.
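As the title suggests, one way to pull fields out of a JSON column is `json_tuple`. Below is a minimal sketch of that approach; the column names, sample data, and app name are illustrative assumptions, not taken from the video, and it requires a Spark runtime to execute.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.json_tuple

// Minimal sketch: flatten a JSON string column with json_tuple.
// Column names and sample data are illustrative assumptions.
val spark = SparkSession.builder()
  .appName("csv-json-flatten")
  .master("local[*]")
  .getOrCreate()
import spark.implicits._

val df = Seq(
  (1, """{"Zipcode":704,"City":"PARC PARQUE","State":"PR"}""")
).toDF("id", "request")

// json_tuple extracts the named top-level JSON fields as string columns;
// .as(Seq(...): _*) gives the generated columns readable names
val flat = df.select(
  $"id",
  json_tuple($"request", "Zipcode", "City", "State")
    .as(Seq("Zipcode", "City", "State"): _*)
)
```

Note that `json_tuple` returns every field as a string; if you need typed columns, `from_json` with an explicit schema (shown later in the comments) is the usual alternative.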

Please subscribe to my channel and provide your feedback in the comments section.
Comments

Useful information, presented in detail. Thanks Sravana

vasudeorane

Hello, thank you for this video. I'm looking at a similar scenario: I have a huge CSV file with both comma-delimited and JSON-formatted data, and I don't know the column details of the JSON up front. How can I derive the column names and assign them dynamically, or is there another option available? Thanks in advance

suprajar
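One way to handle an unknown JSON layout is to let Spark infer the schema from the JSON strings themselves instead of hard-coding a `StructType`. This is a sketch under assumptions: the file path and the column names (`id`, `request`) are illustrative, and it needs a Spark runtime.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, from_json}

val spark = SparkSession.builder().master("local[*]").getOrCreate()
import spark.implicits._

// Assumed input: a CSV with an "id" column and a "request" column holding JSON
val input_df = spark.read.option("header", true).csv("input.csv")

// Infer the JSON schema dynamically by reading the JSON column as a Dataset[String]
val jsonSchema = spark.read.json(input_df.select("request").as[String]).schema

// Apply the inferred schema with from_json and flatten the struct
val output_df = input_df
  .select(col("id"), from_json(col("request"), jsonSchema).as("json_request"))
  .select("id", "json_request.*")
```

The trade-off of `spark.read.json` on the column is an extra pass over the data to infer the schema; for very large files you may prefer sampling or caching that column first.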

Can you please provide the sample data?

mondritaroy

Hi Sravana
I am getting null values when using from_json; can you help me figure out the missing piece here? TY
~ input is a .csv file with JSON, e.g.:
id, request
1, {"Zipcode":704, "ZipCodeType":"STANDARD", "City":"PARC PARQUE", "State":"PR"}
2, {"Zipcode":704, "ZipCodeType":"STANDARD", "City":"PASEO COSTA DEL SUR", "State":"PR"}
~ my code (Scala/Spark)
import org.apache.spark.sql.functions.{col, from_json}
import org.apache.spark.sql.types._
import spark.implicits._

val input_df = spark.read
  .option("header", true)
  .option("escape", "\"")
  .csv(json_file_input)

val json_schema_abc = StructType(Array(
  StructField("Zipcode", IntegerType, true),
  StructField("ZipCodeType", StringType, true),
  StructField("City", StringType, true),
  StructField("State", StringType, true)
))

val output_df = input_df
  .select($"id", from_json(col("request"), json_schema_abc).as("json_request"))
  .select("id", "json_request.*")

thesadanand
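A likely cause of the nulls, judging from the sample shown (this is an assumption, not confirmed by the author): the JSON payload is not quoted, so the CSV reader splits each line at the commas inside the braces, and `from_json` receives a truncated string such as `{"Zipcode":704`, which fails to parse and yields null. One workaround is to read the lines as plain text and split only on the first comma, so the commas inside the JSON survive. The sketch below assumes the file layout shown above and Spark 3.0+ (for the three-argument `split`).

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, from_json, split, trim}
import org.apache.spark.sql.types._

val spark = SparkSession.builder().master("local[*]").getOrCreate()
import spark.implicits._

val json_schema_abc = StructType(Array(
  StructField("Zipcode", IntegerType, true),
  StructField("ZipCodeType", StringType, true),
  StructField("City", StringType, true),
  StructField("State", StringType, true)
))

// Read whole lines, drop the header row, and split on the FIRST comma only
// (limit = 2), so commas inside the JSON object are left untouched
val raw = spark.read.text(json_file_input).filter(!$"value".startsWith("id,"))
val parsed = raw.select(
  trim(split($"value", ",", 2).getItem(0)).as("id"),
  trim(split($"value", ",", 2).getItem(1)).as("request")
)

val output_df = parsed
  .select(col("id"), from_json(col("request"), json_schema_abc).as("json_request"))
  .select("id", "json_request.*")
```

Alternatively, if you control the input file, quoting the JSON column (`1,"{""Zipcode"":704, ...}"`) lets the original `.csv()` read with `escape` set to `"` work unchanged.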

Can you please provide the sample data and source code? (A GitHub link should do.) ~ty

thesadanand