How to use dbt Snapshots to track data history

Get a FREE checklist to build simple, reliable data architectures (without the mess)

Some data points naturally change over time, but it happens slowly.

Unsurprisingly, this is referred to as a "slowly changing dimension" or SCD.

This is a common data modeling scenario but can take a while to get right.

But if you use dbt, you can take advantage of its built-in Snapshots feature.

Snapshots require just a few configurations to set up, and the result is a new table that tracks historical changes just like an SCD.
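
As a rough sketch of what those few configurations might look like, here is a snapshot written with dbt's SQL block syntax. The source name (crm), table (customers), key column (customer_id), timestamp column (updated_at), and target schema (snapshots) are all assumptions for illustration and are not taken from the video. It also assumes the source is already declared in a sources yml file.

```sql
-- snapshots/customers_snapshot.sql (hypothetical file and snapshot name)
{% snapshot customers_snapshot %}

{{
    config(
      target_schema='snapshots',   -- assumed schema where the snapshot table is built
      unique_key='customer_id',    -- assumed column that uniquely identifies a source row
      strategy='timestamp',        -- detect changes by comparing a timestamp column
      updated_at='updated_at'      -- assumed column holding the last-modified time
    )
}}

-- Snapshot the source table as-is; transformations belong in downstream models
select * from {{ source('crm', 'customers') }}

{% endsnapshot %}
```

Running `dbt snapshot` builds the table on the first run. On later runs it closes out rows that changed and inserts their new versions, using the dbt_valid_from and dbt_valid_to columns that dbt adds to record each version's validity window. Note that unique_key must really be unique in the source, otherwise the snapshot table can grow far larger than expected.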

So in this video, you'll learn more about what dbt Snapshots are and how to easily add them to your project.

Enjoy!

Timestamps:
0:00 - Intro
0:51 - What are Snapshots?
1:57 - Review Scenario
3:08 - Best Practices
3:55 - Create a Snapshot
7:45 - Add New Data
9:29 - Reference Snapshot in a Model
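
For the "Reference Snapshot in a Model" step, a minimal sketch might look like the following. It assumes a snapshot named customers_snapshot with a customer_id column (both hypothetical) and uses dbt's default behavior of leaving dbt_valid_to as NULL on the current version of each row.

```sql
-- models/staging/stg_customers_current.sql (hypothetical model name)
select
    customer_id,       -- assumed key column from the snapshotted source
    dbt_valid_from,    -- when this version of the row became valid
    dbt_valid_to       -- NULL while this version is still the current one
from {{ ref('customers_snapshot') }}
where dbt_valid_to is null   -- keep only the current version of each row
```

Dropping the where clause returns the full change history instead of only the current rows.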

Title & Tags:
How to use dbt Snapshots to track data history
#kahandatasolutions #dataengineering #dbt
Comments

Get a FREE checklist to build simple, reliable data architectures (without the mess)

KahanDataSolutions

Best explanation of snapshots on the internet, in the shortest possible time.

shobhitgarg

After going through the first 25% of the playbook and then stopping... here I am again, googling answers to a work problem and finding your videos. Appreciate all the info and content.

(Our issue was an incorrect unique_key setting on a table; the snapshot created HUGE tables as a result.)

jppbkm

A part 2 of this video would be awesome, showing how a BI analyst would use snapshots for reporting. For example, if we want a view of the data as of each month, what are the ways we could do that?

moverecursus

Awesome. This helped me understand the concept of slowly changing dimensions (SCD)!

anbarasuramachandran

Thanks for making these clear videos. I find them better than the official ones :p

kvin

Thank you so much for your help. You make it very clear. I got my snapshot working now.

dominicaleung

Hi, thanks for explaining this concept. Is there a way to set the valid-to date of the ending record to d-1? As it stands, using BETWEEN in a WHERE condition returns 2 records instead of 1.

rusttaf

Hi, please make a detailed video on the dbt analytics certification. I am preparing for it, so it would be very helpful.

Appreciate your work on data engineering.

anushanr

Thank you for the video. One question:
Once a snapshot is taken, is it best practice to use the snapshot tables or the source tables when working with other models?
If snapshots are the answer, how frequently do we have to update the snapshot tables? Or do we have to trigger them for every update?

rajasekharreddy

One question: why should we opt for dbt for transformations? Snowflake trial accounts come with millions of rows of sample data for practice. Doesn't Snowflake have the capability to do the transformations itself?

Why should we create models/macros/seeds for the transformations, and why is that necessary?

bashask

What if my source table doesn't have correct column names or datatypes? Should I change them in stg?

LtRogers

Thanks for the very good content. I have a question: how can we track rows deleted from the source?

arbol

Hey, what did you put in the yml file?
I was trying to do it this way, but the staging model could not reference the snapshot for some reason.

nainatiwari

Can you please do a part 2 of this video where you explain how to snapshot using YAML instead of SQL, for dbt version 1.9+ users?

JaydeepMistryCA

How do you implement snapshots if the target table is partitioned?

SoniaMehta-uw

Is it possible to partition snapshots (especially for BigQuery)?

dimi

Hi, how can we set the default NULL values of DBT_VALID_TO to something else, such as a max value?

gauravsati

I ran from data science because I knew I couldn't keep up with all the statistical readings.

BreathOfHopePodcast

I find Snowflake streams to be better than this concept.

sabh