withColumn vs withColumns in Apache Spark | Databricks

Hey Geeks,

In this video, I discuss the withColumns method of Spark, which is available from Spark 3.3.0 onward.

If you are new to this playlist, please watch the playlists below in full.

Full Playlist of Interview Questions of SQL:
Full Playlist of Snowflake SQL:
Full Playlist of Golang:
Full Playlist of NumPy Library:
Full Playlist of PyQt5:
Full Playlist of Pandas:

Comments

Congratulations, man! From Veltech University to 13.5k subscribers on YouTube, you have come a very long way.

parthasaradhireddy

Can you please show an example with a Delta table?

raskotha

New sub here, I like your vids. I have a question for you, though: spark.sql reverts data types to string when using GROUP BY or any kind of JOIN. Any idea why this happens?

wysiwydg

Hey, how many videos are left to complete PySpark?

souvikghosh

There is literally zero evidence that the second one is faster. If you remove display, nothing is executed and you only create an execution plan. The transformation itself is executed only when you call an action in PySpark.

And even if you call display, .collect(), show or count, you would still need to time it with %timeit, or increase the dataset size, to prove that it is faster. I am not saying it is not; I am just saying that this video does not prove it.

PS: I know that you call display() in the video, but the dataset is simply so small that you will only see differences of milliseconds, which are not comparable. Try running it with %timeit.

lubomirfranko