Advancing Spark - Getting hands-on with Delta Cloning

Last week we looked at the announcements for Databricks Runtime 7.2 and got all excited about the notes for Delta Cloning - but we had some really good questions raised about exactly what happens under the hood. So this week join Simon as he takes a bit of a dive into DEEP and SHALLOW cloning with Delta on Databricks.
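For readers who want to see the two variants side by side, here is a minimal sketch of the statement shapes, assuming the Databricks SQL form `CREATE TABLE ... [SHALLOW | DEEP] CLONE ...`. The helper `clone_statement` and the table names `sales`, `sales_backup` and `sales_dev` are hypothetical and not taken from the video. The code only builds the SQL strings; on a Databricks cluster each string would presumably be passed to `spark.sql(...)`, which is not invoked here.

```python
def clone_statement(target: str, source: str, kind: str = "DEEP", replace: bool = False) -> str:
    """Build a Delta CLONE statement as a string (hypothetical helper, for illustration only)."""
    kind = kind.upper()
    if kind not in ("DEEP", "SHALLOW"):
        raise ValueError(f"kind must be DEEP or SHALLOW, got {kind!r}")
    # CREATE OR REPLACE overwrites an existing target; IF NOT EXISTS leaves it untouched.
    create = "CREATE OR REPLACE TABLE" if replace else "CREATE TABLE IF NOT EXISTS"
    return f"{create} {target} {kind} CLONE {source}"


# Hypothetical table names, used only to show the two statement shapes.
deep_sql = clone_statement("sales_backup", "sales")
shallow_sql = clone_statement("sales_dev", "sales", kind="shallow", replace=True)

print(deep_sql)
print(shallow_sql)
```

A deep clone copies the data files as well as the metadata, while a shallow clone copies only the metadata and keeps referencing the source table's files.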

Comments

Very nice video! Keep up the good work!
I am in the process of learning and your videos are excellent. I can only hope you will continue to upload new interesting stuff.
Thank you!

lucian

Hey Simon, thanks a lot for this video. A question: how would you then promote the cloned version so that it becomes the live original? Thanks

nikkaz

Hey Simon, thanks for a great video. Just the kind of channel I was looking for. A quick question: I am wondering what the best way is to copy only certain partitions of a Delta table into a new Delta table without having to copy all the contents. I assumed cloning would help somehow, but that does not seem to be the case.

prashanthxavierchinnappa

Nice video -yet- again Simon!
I really appreciate how you take the time to show all the manipulations and even the bugs ;)
Seems like a cool feature, but I'm wondering how it would fare if I am cloning a huge table of 70-140M rows? Maybe some stress-testing would be needed on my side :)
On the lighter side, please don't zoom in on your face too often; I get mesmerized by your eyes (are they blue-green?) and I need to replay the parts multiple times :D HAHAHAHA #GirlProblems

the.activist.nightingale

Thanks Simon. I can see the use case for this in a DR scenario, where asynchronous copying of data between primary and secondary regions in ADLS or Blob storage can leave Delta tables corrupted! Does DEEP CLONE happen with ACID guarantees? What if you are cloning big tables and the cloning operation is interrupted? Does it land incomplete data?

bhaveshpatelaus

Shallow clone: what happens to the cloned table if we update the original table? As we understand it, the cloned table initially points to the original table's data. Thanks

sid