Data Automation (CI/CD) with a Real Life Example

Get my Modern Data Essentials training (for free) & start building more reliable data architectures

-----

One of the most fun aspects of being a data engineer is creating different automations.

In particular, one area that's really important is CI/CD, which stands for Continuous Integration and Continuous Deployment.

This is where you can automate your testing and release strategy.

But I also understand that this concept can be a little vague or unclear if you haven't seen it in action.

So in today's video I'll show you a real-life example of how to use GitHub to make this happen.

This is also a good example of why people really like code-based tools: the ability to automate workflows like this.

This applies not only to your deployments but also, as we'll mostly cover in this video, to automating your data quality checks.
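
To make this less abstract, here is a minimal sketch of what such a workflow file could look like. The file name, the dbt adapter, the `ci` target, and the secret names are illustrative assumptions, not the exact setup shown in the video:

```yaml
# .github/workflows/data-quality.yml  (hypothetical file name)
name: Data quality checks

on:
  pull_request:
    branches: [main]   # run automatically on every PR targeting main

jobs:
  dbt-tests:
    runs-on: ubuntu-latest
    steps:
      # Pre-built actions from the GitHub marketplace
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"

      # Install dbt and run the project's tests against a CI target
      # (adapter, target name, and secret names are placeholders)
      - run: pip install dbt-snowflake
      - run: dbt deps
      - run: dbt test --target ci
        env:
          SNOWFLAKE_ACCOUNT: ${{ secrets.SNOWFLAKE_ACCOUNT }}
          SNOWFLAKE_USER: ${{ secrets.SNOWFLAKE_USER }}
          SNOWFLAKE_PASSWORD: ${{ secrets.SNOWFLAKE_PASSWORD }}
```

Committing a file like this under `.github/workflows/` is all it takes for GitHub to pick it up and run the checks on every pull request.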

Thank you for watching!

Timestamps:
0:00 - Intro
0:43 - Create Workflow File
1:18 - Review File Layout
2:55 - Use Pre-built Actions
4:06 - Trigger Workflow

Title & Tags:
Data Automation (CI/CD) with a Real Life Example
#kahandatasolutions #dataengineering #automation
Comments

I was literally about to research CI/CD, and I randomly opened YouTube to watch a Dallas Mavs podcast and saw your video.
Watched and understood. Thank you!

amazing-graceolutomilayo

Awesome content! Any chance you'd be able to do a video on dbt Cloud CI? My team is using dbt Cloud and we're definitely going to be implementing slim CI jobs.

Fajita_boi_swag
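
For context, a "slim CI" job typically builds only the models changed in a pull request and defers everything else to production state. A rough sketch of how that could look in a GitHub Actions job, assuming the production `manifest.json` has been made available in a `prod-artifacts/` folder (a hypothetical path; adapter and connection settings are placeholders and omitted for brevity):

```yaml
name: Slim CI

on:
  pull_request:

jobs:
  slim-ci:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install dbt-snowflake
      - run: dbt deps
      # Build only the models modified in this PR (plus their children),
      # deferring unchanged parents to the production state in prod-artifacts/
      - run: dbt build --select state:modified+ --defer --state prod-artifacts
```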

We're in the process of setting much of this up. We want our end workflow to be something like
* Clone PROD database (Using Snowflake ZCC)
* dbt build against that Clone
* Perform a data-diff between PROD and Clone
* Report results of the dbt build AND any data-diff variations for review
* Bring down the Clone

We have a bit of background work to do before we're ready for that, so for now we just run a `dbt compile` step. It doesn't catch everything, but it does at least catch simple syntax issues or things like invalid docs syntax.

aldredd
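
As a rough illustration of the workflow described in the comment above, the steps could be wired together in a single GitHub Actions job like the sketch below. The database names, the use of the Snowflake CLI (`snow sql`), and the `DBT_DATABASE` environment variable are all placeholders; connection configuration and the actual data-diff tooling are omitted:

```yaml
name: PR build against a PROD clone

on:
  pull_request:

jobs:
  build-on-clone:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install dbt-snowflake snowflake-cli

      # 1. Clone the PROD database with Snowflake zero-copy cloning
      - run: snow sql -q "CREATE DATABASE CI_CLONE CLONE PROD"

      # 2. dbt build against that clone (assumes the CI profile reads the
      #    target database name from an environment variable)
      - run: dbt build --target ci
        env:
          DBT_DATABASE: CI_CLONE

      # 3. A data-diff between PROD and the clone would be reported here

      # 4. Bring down the clone, even if an earlier step failed
      - if: always()
        run: snow sql -q "DROP DATABASE IF EXISTS CI_CLONE"
```

The `if: always()` condition makes the teardown step run even when the build or diff fails, which avoids leaving stale clones behind.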

Really wish I could get a job soon so I could get to implement all this!

naraendrareddy

Great video! I've just been wanting to learn CI/CD with dbt.
But how does it test the functionality of dbt models? Doesn't it need the actual data the models operate on? I thought GitHub Actions run on an ephemeral machine (which doesn't have access to my data), or am I wrong?

melnikovjnr

I guess this is all a bit too advanced for me: where are you making all these changes at the beginning? VS Code? I can't see the full window, so it's difficult to follow.

eugenmalatov

If I follow this process, does that mean I wouldn't need a Kubernetes cluster and Argo Workflows?

datalearningsihan