Building Robust Data Pipelines for Modern Data Engineering | End to End Data Engineering Project

In this video, we set up an end-to-end data engineering project using Apache Spark, Azure Databricks, and Data Build Tool (DBT), with Azure as our cloud provider. This project illustrates the process of data ingestion into the lakehouse, data integration with ADF, and data transformation with Databricks and DBT.

Timestamps:
0:00 Introduction
0:49 System Architecture
3:01 Creating resource groups on Azure
5:02 Setting up the medallion architecture storage account
8:46 Setting up Azure Data Factory
10:18 Azure Key Vault setup for secrets
14:19 Azure database with automatic data population
25:32 Azure Data Factory pipeline orchestration
47:00 Setting up Databricks
49:50 Azure Databricks Secret Scope and Key Vault
54:33 Verifying Databricks - Key Vault - Secret Scope Integration
1:06:00 Azure Data Factory - Databricks Integration
1:21:19 DBT Setup
1:24:15 DBT Configuration with Azure Databricks
1:32:12 DBT Snapshots with Azure Databricks and ADLS Gen2
1:45:06 DBT Data Marts with Azure Databricks and ADLS Gen2
1:55:00 DBT Documentation
1:58:58 Outro
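
For readers following the storage steps in the timestamps above, here is a minimal sketch of how medallion layers are commonly addressed on ADLS Gen2. It assumes one container per layer (bronze, silver, gold), which may differ from the layout used in the video; the account name "examplelake" and the table path are hypothetical placeholders. Only the abfss:// URI shape is standard.

```python
# Sketch only: "examplelake" and the folder names are hypothetical placeholders,
# not values taken from the video. Assumes one container per medallion layer.
MEDALLION_LAYERS = ("bronze", "silver", "gold")


def adls_uri(account: str, layer: str, path: str = "") -> str:
    """Return the abfss:// URI for a path inside one medallion container."""
    if layer not in MEDALLION_LAYERS:
        raise ValueError(f"unknown medallion layer: {layer!r}")
    base = f"abfss://{layer}@{account}.dfs.core.windows.net/"
    # Strip stray slashes so callers can pass "/sales/orders/" or "sales/orders".
    return base + path.strip("/")


def table_locations(account: str, table: str) -> dict:
    """Map each layer to the location of the same table in that layer."""
    return {layer: adls_uri(account, layer, table) for layer in MEDALLION_LAYERS}


if __name__ == "__main__":
    for layer, uri in table_locations("examplelake", "sales/orders").items():
        print(f"{layer:>6}: {uri}")
```

Keeping the layer names in one tuple means a mistyped layer fails fast instead of silently writing to the wrong container.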

Resources:

If you find our content valuable, support us by joining our channel membership, where you'll get exclusive access to behind-the-scenes content, Q&A sessions, and much more!

💬 Join the Conversation:
We love hearing from you! Share your thoughts, questions, or experiences related to data engineering or this project in the comments below. Don't forget to like, subscribe, and hit the bell icon to stay updated with our latest content.

Tags:
Big Data, Data Engineering, Apache Spark, Databricks, DBT, Azure, Cloud Computing, Data Analytics, ETL, Data Warehouse, Technology, Analytics, Machine Learning, Data Science

Hashtags:
#BigData, #DataEngineering, #ApacheSpark, #Databricks, #DBT, #Azure, #CloudComputing, #DataAnalytics, #ETL, #DataWarehouse, #TechTalk, #MachineLearning, #DataScience, #BigDataAnalytics

🙏 Thank You for Watching!
Remember to subscribe and hit the bell icon for notifications. Stay curious and keep exploring the fascinating world of data engineering!
Comments
Author

Spark your curiosity and 'data-fy' your feed - hit LIKE, SUBSCRIBE, and ring the bell. Join our byte-sized revolution in data engineering!💡🚀

CodeWithYu
Author

So happy I found this!! It's brilliant! You are a fantastic teacher.

vemedia
Author

Awesome content, and very instructive and educational. Thanks a lot, sir.

dotproduct
Author

You are the best! Keep up the good work!

rasmusandreasson
Author

Nice to see such diamond content on YouTube. Please keep it up, bro. Your YouTube lectures are a very helpful resource for Indian students. ❤❤

CodewithAIYoutubeChannel
Author

🤗 thank you for your hard work, we appreciate it 🙏

nadiiar
Author

Thank you for your hard work, you are the best

wiss
Author

It would be nice if you used Terraform to set up the cloud services! And again, great video!

RodrigoBlaudt
Author

Would recommend this for a data scientist who is just getting started with data engineering

newagegenre
Author

Thank you very much for your content!

Quick question: I can see the snapshot in Databricks, but not in the silver layer under the storage account. Can you suggest what could have gone wrong?

himanshugandhi
Author

Awesome content, thank you so much! I have never worked with dbt before. Just curious: what is the advantage of using dbt along with Databricks when Databricks itself is a compute engine?

deede
Author

Thanks for posting this video. I have a question: you are using Data Factory for data ingestion and Databricks for transformation, so where does dbt come in? Isn't the core purpose of dbt to be used for transformation?

dharmiknaik
Author

It's a really nice project. I expected some analysis dashboards at the end, such as Looker or Power BI dashboards built with SQL queries, but there are none. Is it only transformation?

naren
Author

Great videos, man. Do you have any end-to-end projects involving Snowflake? I see Snowflake a lot in job specifications and would like to get up to speed on it.

soundbeans
Author

Hi! Would it make sense to implement Terraform for Azure here as a Databricks option to deploy dbt?

RafaVeraDataEng
Author

I hope you are doing well.
I have a question:
Let's suppose I have PostgreSQL running on my local machine with Windows 11, and I create a script inside a container in Docker, but I am not able to connect to the database.
I tried many things, like port mapping and everything, but it still doesn't work.
Is this possible on Windows or not?

aliel-azzaouy
Author

If we want to write about this project on a resume, what exactly do we need to mention?

PoojaSharma-pS
Author

Thank you!!! I have an issue running pip install dbt-databricks, which affects the final part of this project. Any suggestions, please? Thanks

francescolombardi
Author

@CodeWithYu Do you have any architecture videos like this for data engineers?

marce_f
Author

Hey, just out of curiosity:

How much did the ADLS cost? And overall, out of the $200?

ragegodoverpowered