Solve using REGEXP_REPLACE and REGEXP_EXTRACT in PySpark

Hello Geeks,

data=[(1,"Sagar-Prajapati"),(2,"Alex-John"),(3,"John Cena"),(4,"Kim Joe")]
schema="ID int,Name string"

If you want to build projects on Azure and Databricks, check out the courses below:

1. Delta-Lake using Databricks:

2. Azure Course:

3. Master in Python:

4. Git and Linux:

Join my Telegram group
Telegram:

Follow me on LinkedIn:

#interviewquestion #pysparkinterview
Comments

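from pyspark.sql.functions import col, regexp_extract

# df is assumed to be built from the data and schema in the description.
pattern = r"(\w+)-? ?(\w+)"  # word, optional hyphen, optional space, word
df1 = (
    df.withColumn("FirstName", regexp_extract(col("Name"), pattern, 1))
      .withColumn("LastName", regexp_extract(col("Name"), pattern, 2))
)

df1.show()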

Can we write like this?
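
One way to check a pattern like this before running it on a cluster is to try it on the sample rows with Python's built-in re module. This is only a rough sketch: Spark evaluates regexp_extract with Java regular expressions, which I am assuming behave the same as Python's for a simple ASCII pattern like this one. The helper name extract is made up for illustration; it mimics regexp_extract by returning an empty string when nothing matches.

```python
import re

# Sample rows copied from the video description.
data = [(1, "Sagar-Prajapati"), (2, "Alex-John"), (3, "John Cena"), (4, "Kim Joe")]

# Pattern from the comment: word, optional hyphen, optional space, word.
PATTERN = re.compile(r"(\w+)-? ?(\w+)")


def extract(name, group):
    """Hypothetical stand-in for regexp_extract: the group's text, or "" if no match."""
    match = PATTERN.search(name)
    return match.group(group) if match else ""


# Build (ID, Name, FirstName, LastName) tuples, like the two withColumn calls would.
rows = [(id_, name, extract(name, 1), extract(name, 2)) for id_, name in data]
for row in rows:
    print(row)
```

On the four sample names the pattern splits first and last name as intended. One thing to watch: because both the hyphen and the space are optional, a single-word name still matches, and backtracking leaves only its last character in the second group (for example "Madonna" gives "Madonn" and "a"). If that case can occur in the real data, requiring a separator with something like [- ] in place of -? ? may be safer.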

ngozvze