How to Find the Delimiter Dynamically in CSV Files? | Databricks Tutorial | PySpark | Automation |

If you like this video, please share and subscribe to my channel.

Full Playlist of SQL Interview Questions:
Full Playlist of Snowflake SQL:
Full Playlist of Golang:
Full Playlist of NumPy:
Full Playlist of PyQt5:
Full Playlist of Pandas:

#databricks #pyspark #delimiter #json
Comments

No need for regex. Import csv instead. The following code will catch all delimiters:

import csv

def getDelimeter(file_path):
    try:
        # Read the first line of the file; sc is the SparkContext,
        # which Databricks notebooks provide preconfigured.
        # take(1) returns the first line as a one-element list.
        headerlist = sc.textFile(file_path).take(1)
        header_str = str(headerlist)
        # csv.Sniffer infers the dialect, including the delimiter
        dialect = csv.Sniffer().sniff(header_str)
        return dialect.delimiter
    except Exception as e:
        print("Error Occurred ", str(e))

bermagot
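
A quick way to sanity-check the sniffer approach outside Spark (a minimal sketch with a made-up pipe-delimited sample): passing an explicit candidate set via the delimiters argument can make csv.Sniffer more reliable on very short samples.

import csv

# Hypothetical pipe-delimited sample, just to exercise csv.Sniffer
sample = "id|name|city\n1|alice|paris\n"

# Restricting the candidates keeps the sniffer from guessing an
# ordinary letter as the delimiter on tiny samples
dialect = csv.Sniffer().sniff(sample, delimiters=",;|\t")
print(dialect.delimiter)  # prints: |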

In our project they have written custom code in a UDF for SCD type 1 and type 2, plus SCD type 2 soft delete. I am not able to understand that code. The Jira ticket mentions whether it is a full load or an incremental load, and we need to work on that. How does it dynamically know whether it is a full or incremental load? Can you please explain, bro?

sravankumar
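
For what it's worth, one common way pipelines decide this (a sketch of a generic pattern, not this commenter's actual project code; source_df, target_table, and the id join key are placeholders) is to pass the load type in as a job parameter and branch on it:

from delta.tables import DeltaTable

# "load_type" is assumed to arrive as a job parameter, e.g. a
# Databricks widget set by the orchestrator from the job config
load_type = dbutils.widgets.get("load_type")  # "full" or "incremental"

if load_type == "full":
    # Full load: replace the target table wholesale
    source_df.write.mode("overwrite").saveAsTable("target_table")
else:
    # Incremental load: upsert changed rows (SCD type 1 style)
    target = DeltaTable.forName(spark, "target_table")
    (target.alias("t")
        .merge(source_df.alias("s"), "t.id = s.id")
        .whenMatchedUpdateAll()
        .whenNotMatchedInsertAll()
        .execute())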

Bro, under the try block you have defined sc.textFile(). Can you please shed some light on what "sc" is?

nitinpandey
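
For context: in PySpark, sc is the conventional name for the SparkContext. Databricks notebooks provide it preconfigured; in a standalone script you can pull it from the SparkSession, roughly like this (the app name and file path are placeholders):

from pyspark.sql import SparkSession

# Build (or reuse) a SparkSession, then get the SparkContext from it
spark = SparkSession.builder.appName("delimiter-demo").getOrCreate()
sc = spark.sparkContext

# sc.textFile reads a text file as an RDD of lines
first_line = sc.textFile("/path/to/file.csv").take(1)
print(first_line)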

I get an error: 'SparkSession' object has no attribute 'textFile'.
I already have a Spark session configured in my file.

encryptedunlimited
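
That error is expected: textFile is a SparkContext method, not a SparkSession method. Assuming the session variable is named spark, either go through spark.sparkContext or use the DataFrame reader (the path is a placeholder):

# textFile lives on the SparkContext, so reach it via the session
lines_rdd = spark.sparkContext.textFile("/path/to/file.csv")

# Or read the file as a DataFrame of lines instead
lines_df = spark.read.text("/path/to/file.csv")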