Cleansing CSV data and processing it in PySpark | Scenario-based question | Spark Interview Questions

Hi Friends,
Sample code is checked into GitHub:

In this video, I explain how to read a CSV file and process it using PySpark.
The CSV has multiple lines for a single Id and uneven columns (a different number of columns in each row).
Please subscribe to my channel for more interesting learnings.
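The cleansing idea described above (pad uneven rows to a common width, then gather every line that shares an Id) can be sketched in plain Python. This is only an illustration under stated assumptions, not the code from the video or the GitHub repository: the sample data, the assumption that the first field is the Id, and the helper names `read_uneven_rows`, `pad_rows`, and `group_by_id` are all hypothetical.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample data: the first field is assumed to be the Id,
# and rows deliberately have different numbers of columns.
SAMPLE_CSV = """1,alice,math
1,physics
2,bob
2,chemistry,biology,history
"""


def read_uneven_rows(text):
    """Parse CSV text into lists of fields, skipping blank lines."""
    return [row for row in csv.reader(io.StringIO(text)) if row]


def pad_rows(rows, fill=None):
    """Pad every row with `fill` up to the width of the longest row."""
    width = max(len(row) for row in rows)
    return [row + [fill] * (width - len(row)) for row in rows]


def group_by_id(rows):
    """Collect the non-Id fields of every line that shares the same Id."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[0]].extend(row[1:])
    return dict(grouped)


rows = read_uneven_rows(SAMPLE_CSV)
padded = pad_rows(rows)      # every row now has the same number of columns
grouped = group_by_id(rows)  # one entry per Id, with all its values merged
```

In PySpark the same steps would usually be expressed as DataFrame operations rather than Python loops, for example by reading the file as raw text with `spark.read.text` and splitting each line, so that rows of different widths do not have to fit a fixed schema up front.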
Comments

Your tutorials are simply special Sravana!!

sudippandit

Superb, everyone can easily understand 👍 👏

sravankumar

Amazing video. Now I know where I was going wrong. Thanks for the video.

deathseal

Please do more scenario-based videos on PySpark. In our current project we do the transformations with PySpark in ADB (Azure Databricks); ADF (Azure Data Factory) is used only for data movement.

sravankumar

@sparklingFuture
Why can't we use pivot and filter the data on top of it? It would be a one-liner, right?

shahids

Your videos are awesome and take a more advanced approach, but please upgrade your audio setup. It's a request. 🙏

akashbalmiki

Can you please cover this scenario: how to load a CSV file into JSON with a nested hierarchy using PySpark in ADB? For example, the CSV has custid, custname, itemname, quantity, and when it is converted to nested JSON it becomes custid, custname, purchases { itemname : book, quantity : 2 }, since one customer can buy multiple items.

sravankumar

Hello, can you please confirm: when you first extracted the data from the CSV, where did you specify the column names? How were the column names generated in the output of the show command?

rajanib

How do we merge Spark DataFrames with complex types? If we have two JSON files and the json1 schema and json2 schema are different, how can we merge them using PySpark? Can you please explain this scenario?

sravankumar