To Surrogate Key or Not...

Показать описание

We've talked about using a surrogate key in your data warehouse whether that's Azure Synapse Analytics or something else. Patrick looks at why you should consider this even if you aren't using a slowly changing dimension.

*******************

Want to take your Power BI skills to the next level? We have training courses available to help you with your journey.

*******************
LET'S CONNECT!
*******************

***Gear***

#AzureSynapse #DataWarehouse #GuyInACube

Рекомендации по теме

Комментарии

Surrogate keys are THE most critical piece for good model management. Management of keys pushes you to ensure the proper Schema.

Do it! Do it!

ChrisWagnerDatagod

Incredible! Keep these SQL theory vids coming. This is high-level stuff. Thanks Patrick for continuing to be there for us enterprise level SQL folks. YOU DA MAN

eziola

On a related note, I've seen users replace string-based natural keys with integer-based surrogate keys in their Power BI data models, assuming the smaller integer data type would be more performant. However, most of the time, the VertiPaq engine will hash these integers, making them strings anyways, so all the user is really doing is adding complexity to their transformations and compute to their refreshes.

jeffshieldsdev

and now the question is beacuse i not have options to use identity in DWH in microsoft fabric?

vicenteantoniomagallanesju

Important to understand when to use them, and also when not to. Sometimes its very easy to overcomplicate your models with an insistence on 'incremental' SKs, even when they are not needed or even costly (e.g. in columnar datasets). I prefer the Super Natural Key / Durable Key model, or even better, having natural dimensional fields actually in the facts for the columnar data compression to work with.

JohnBrown-hxbq

Nice simple explanation of when & why to use surrogate keys. Particularly the why - too often surrogate keys are presented as being good design always, even when they aren't needed and the natural key will work fine.

Side note:

One habit that doesn't seem to go away, but has no value vs using the natural DATE is creating a YYYYMMDD integer surrogate key for a date (aka Calendar) table.

Bad: The INT has more bytes than DATE - it will cause more IO. So it will be slower, even if only marginally.
Bad: It has no integrity enforcement as a valid date.
Bad: It is awkward to manipulate as a date - e.g. using date functions.

Dates as 4 byte INT has not had value since the introduction of SQL Server 2008, with the 3 byte DATE type.

pauloneufneufneuf

How about you need to allow to user to add like a budget fact table to the model (Copositive model) then they will need to know the SK value of an Item, if the Item Dimension has NK then they don't, just add it and it works.

etacarl

Not my first priority. But you hit on the best reason to use it, if your IDs like CustomerID may change or be augmented in future.

MilhouseBS

Q: i don’t understand the implementation of surrogate keys. I feel like I’m missing an obvious and low compute way of maintaining all the surrogate keys on your facts. No videos I’ve seen discuss this. But it seems every time a new fact record is generated you have to join every related dim on the foreign natural key and update the fact with the dim’s related surrogate key. So that you can later perform joins using the surrogate key. Am I thinking through this correctly?

dunlapww

Nice video. But i have question. What if customerId was a varchar do we still need to have surrogate key?can you please clarify?

umeshchanchlani

During regular load then dimensions needs to be loaded first then facts by look-up on dimensions for keys

Vikasptl

Good vídeo, I'm going to share this with my team! Also the new source system could have overlapping IDs (even thou it is varchar it can contain ints)

DanielRodriguesPT

Maybe i'm missing some thing, once there is a new line in the customer dimension table how can the fact table get the new CustomerSK?

maherkebaier

If the data you are processing has or could have more than one source, and you want this representation, the information you are providing, to be used by any other process, then an independent (surrogate) key is needed to represent the composite information.

If you were to represent a multi-source dataset with the key from one of its sources, you would be changing the meaning of the source keyfield and adding a processing dependency.

I would argue that every dataset should have an independent keyset because change is inevitable in real-world processes. Certainly, for repositories that are intended to be long-lived data warehouses, independent (surrogate) keys should be the default for the reasons illustrated here, among others.

optimalchoice

I don’t think customer id is a natural key. Natural keys are derived from the columns. If the id is chosen arbitrarily, then it’s synthetic

fb-guer

To Surrogate Key or Not...

To Surrogate Key or Not...

Can't create a relationship? Time for a surrogate key in Power BI!

Database Design 25 - Surrogate Key and Natural Key

Primary key and Surrogate key in DATABASE

Twitter Wars! Natural vs Surrogate Keys

Database Keys Made Easy - Primary, Foreign, Candidate, Surrogate, & Many More

11 - Surrogate Key Considerations

Natural vs. Surrogate Keys: Making Informed Choices for Database Design

What is surrogate key in Data Modelling?

Surrogate Keys | SQL

How to use Primary Keys | When are surrogate keys used?

Topic 02, Part 06 - Primary Keys and Surrogate Keys

Surrogate Key #sql #sqltutorialforbeginners #sqliseasy

Relational Database Concepts: Natural Keys, Candidate Keys, Surrogate Keys, Secondary Keys

surrogate key #sql

Types of Keys in Database |Primary Key |Candidate Key |Super Key |Surrogate Key| Natural Key

Database Design 26 - Should I use Surrogate Keys or Natural Keys?

How to use Surrogate Key Transformation in Mapping Data Flow Activity in Azure Data Factory 2022

What is Surrogate Key | Data Warehouse Tutorial For Beginners | Data Warehouse Concepts (13/30)

Snowflake Sequences & Surrogate keys.

Difference between UNIQUE , PRIMARY and SURROGATE KEYS

SQL : What is the difference between a primary key and a surrogate key?

How to Replace a Natural Key with a Surrogate Key (Autonumber) in Microsoft Access

Surrogate Key vs Natural Key | #shorts #shortvideo #shortsvideo #subscribe #short #datawarehouse