Advancing Spark - Identity Columns in Delta
A classic challenge in data warehousing is getting your surrogate key patterns right - but without the same tooling, how do we achieve that in a lakehouse environment? We've had several patterns in the past, each with its drawbacks, but now we've got a brand new IDENTITY column type... so how does it measure up?
In this video Simon does a quick recap of the existing surrogate key methods within Spark-based ETL processes, before looking through the new Delta Identity functionality!
As always, if you're beginning your lakehouse journey, or need an expert eye to guide you on your way, you can always get in touch with Advancing Analytics.
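For a flavour of the existing methods being recapped, here is a minimal PySpark sketch - assuming the usual suspects of monotonically_increasing_id() and row_number() over a window, with made-up table and column names purely for illustration:

```python
from pyspark.sql import SparkSession, Window, functions as F

spark = SparkSession.builder.getOrCreate()

# Hypothetical dimension rows that need a surrogate key.
df = spark.createDataFrame([("Joe",), ("Anna",)], ["customer_name"])

# Pattern 1: monotonically_increasing_id() - unique within a run, but the
# values are neither contiguous nor stable, so incremental loads still need
# an offset from the current max key in the target table.
keyed_1 = df.withColumn("customer_key", F.monotonically_increasing_id())

# Pattern 2: row_number() over a global window - contiguous keys, but every
# row is pulled onto a single partition, which limits scalability.
keyed_2 = df.withColumn(
    "customer_key", F.row_number().over(Window.orderBy(F.lit(1)))
)

keyed_1.show()
keyed_2.show()
```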
00:00 - Hello
01:37 - Existing Key Methods
10:36 - New Identity Functionality
15:18 - Testing a larger insert
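And for the new functionality itself, a rough sketch of the identity column syntax - continuing from the snippet above, assuming Databricks Runtime 10.4+ and using hypothetical table and column names:

```python
# Minimal sketch of a Delta identity column (assumes Databricks Runtime 10.4+).
spark.sql("""
    CREATE TABLE IF NOT EXISTS dim_customer (
        customer_key  BIGINT GENERATED ALWAYS AS IDENTITY (START WITH 1 INCREMENT BY 1),
        customer_name STRING
    ) USING DELTA
""")

# The identity column is omitted from the insert - Delta assigns the keys.
spark.sql("INSERT INTO dim_customer (customer_name) VALUES ('Joe'), ('Anna')")

# Keys are unique and increasing, but not guaranteed to be gap-free.
spark.sql("SELECT * FROM dim_customer ORDER BY customer_key").show()
```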