Data Collab Lab: Automate Data Pipelines with PySpark SQL

Comments

Very good tutorial, very useful. I am wondering if you have an index page that lists all such Databricks demos, so I can go through all of them.

StoneZhong

Hi Denny,

Between Merge (update, insert) and Update followed by Insert, which approach performs better? Or do both execute with identical performance?

leodixx

Hi, thanks for this session. I am very new to Databricks and Python, and I have one question. In the video you showed a run for only one JSON file. If I change the two parameters, the file path and the table name, will it work in a similar fashion?

vipinkumarjha

It throws an index out of range error after the regex, when I try to do:
schema_for_table = pattern[0] + ', ' + pattern[1]

vamshi
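
An "index out of range" error on `pattern[1]` usually means the regex produced fewer than two matches. The sketch below assumes `pattern` is the list returned by `re.findall`; the helper name `build_schema`, the sample regex, and the sample input are hypothetical and not taken from the webinar notebook. It checks the number of matches before indexing, so the failure points at the input rather than at the concatenation.

```python
import re

def build_schema(text, regex):
    """Join the first two regex matches into a schema string.

    Hypothetical helper: the real notebook's regex and input are not shown
    in the comment, so both are passed in as parameters here.
    """
    pattern = re.findall(regex, text)
    # Guard against fewer than two matches before indexing into the list.
    if len(pattern) < 2:
        raise ValueError(
            f"expected at least 2 matches, got {len(pattern)}: {pattern!r}"
        )
    return pattern[0] + ', ' + pattern[1]

# Assumed sample input: two "name TYPE" pairs separated by a semicolon.
schema_for_table = build_schema("id INT; name STRING", r"\w+ \w+")
```

Printing `pattern` before the concatenation is a quick way to confirm whether the regex matches the actual file contents.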

Thanks, where can we find the notebook and scripts?

shokoufehabrishami

Thanks for this webinar. Sorry, where can I get the notebook? Thanks.

vthamilventhan

Spoiler alert: upsert can be performed in Databricks Delta 🙂

dipanjansaha
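
For readers unfamiliar with the upsert mentioned above, here is a minimal sketch of what a Delta `MERGE INTO` statement can look like when built from Python. The function name `build_merge_sql` and the table and column names (`events`, `events_updates`, `event_id`) are hypothetical, and the statement assumes source and target tables share the same columns. It only builds the SQL text; on a Databricks cluster it would typically be passed to `spark.sql(...)`.

```python
def build_merge_sql(target, source, key):
    """Build a Delta Lake MERGE (upsert) statement as a string.

    Hypothetical helper: rows in `source` whose `key` matches a row in
    `target` update that row; all other source rows are inserted.
    """
    return (
        f"MERGE INTO {target} AS t\n"
        f"USING {source} AS s\n"
        f"ON t.{key} = s.{key}\n"
        "WHEN MATCHED THEN UPDATE SET *\n"
        "WHEN NOT MATCHED THEN INSERT *"
    )

# Assumed table and key names, for illustration only.
merge_sql = build_merge_sql("events", "events_updates", "event_id")
# In a notebook with an active SparkSession: spark.sql(merge_sql)
```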

I know Python and MySQL. To become a Big Data Engineer, what do I need to learn about Spark?

AminulIslam-odnp