Data Modeling with DBT - Step-by-Step Tutorial for Handling 440K Records using Docker and Postgres

Data modeling is a critical step in any analytics or data-centric project. Effective data modeling can lead to better insights and decisions and can improve the overall efficiency of your data pipelines. In this step-by-step tutorial, we'll use DBT (Data Build Tool) to model a dataset of 440K records using Python, Docker, and a Postgres database.

To begin, we need a clear understanding of what data modeling is and why it is so important. In simple terms, data modeling is the process of creating a data model (a visual representation of the data) that describes how data is organized, including the relationships between different data entities.

The data used in this tutorial can be found at

The Docker command to start the Postgres instance is:
docker run -dp 5431:5432 -e "POSTGRES_PASSWORD=pass" -e "POSTGRES_USER=postgres" \
-v /home/postgres-target/:/var/lib/postgresql/data \
--name tutorialDB postgres:latest
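
Because the command maps host port 5431 to the container's port 5432, clients on the host connect to port 5431, not 5432. The following is a minimal Python sketch of the resulting connection settings. It assumes the container runs on the same machine (localhost) and that the database name is left at the official image's default, which is the value of POSTGRES_USER; the SETTINGS dictionary and connection_url helper are illustrative names, not part of the tutorial's script.

```python
from urllib.parse import quote

# Values taken from the docker run command above.
SETTINGS = {
    "host": "localhost",   # assumption: the container runs on this machine
    "port": 5431,          # host side of the 5431:5432 port mapping
    "user": "postgres",    # POSTGRES_USER
    "password": "pass",    # POSTGRES_PASSWORD
    "dbname": "postgres",  # assumption: image default, same as POSTGRES_USER
}


def connection_url(settings):
    """Build a postgresql:// URL, percent-encoding the user and password."""
    user = quote(settings["user"], safe="")
    password = quote(settings["password"], safe="")
    return "postgresql://{}:{}@{}:{}/{}".format(
        user, password, settings["host"], settings["port"], settings["dbname"]
    )


print(connection_url(SETTINGS))
# prints postgresql://postgres:pass@localhost:5431/postgres
```

The same host, port, user, and password values are what a Postgres client or a DBT connection profile would need to reach this container.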

One of the most popular data modeling tools is DBT (Data Build Tool), an open-source tool designed to simplify the process of building data models. DBT helps data analysts and engineers focus on the data rather than on the infrastructure and complexities of the data modeling process. With DBT, you can build and test your data models locally, then deploy them to production with ease.

So, let's get hands-on with the Docker Postgres server and the 440K-record dataset. Once Docker is installed on your laptop or server, the command above can be used to create the database instance, as shown in the video.
The associated Python script loads the dataset into the database server and takes care of cleaning it for you. You will then be able to apply data modeling concepts and techniques to handle large datasets with Python, Docker, and Postgres, and start learning how to use DBT to simplify building complex data models and deploying them to production.
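
The loading script itself is not reproduced in this description, so the following is only a rough sketch of what a clean-and-load step could look like, not the script used in the video. It assumes the dataset is a CSV file with a header row and that the target table already exists with matching column names; clean_row, batched, and load_csv are hypothetical helper names. It works with any DB-API connection and inserts rows in batches so the full 440K records are never held in memory at once.

```python
import csv
from itertools import islice


def clean_row(row):
    """Strip whitespace from keys and values; empty strings become None (NULL)."""
    cleaned = {}
    for key, value in row.items():
        if key is None:  # extra, unnamed fields on a malformed line are dropped
            continue
        value = value.strip() if value is not None else None
        cleaned[key.strip()] = value if value else None
    return cleaned


def batched(rows, size):
    """Yield lists of at most `size` rows from any iterable."""
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def load_csv(conn, handle, table, placeholder="%s", batch_size=10_000):
    """Insert cleaned CSV rows into an existing table; return the row count.

    `table` and the CSV header are interpolated into the SQL text, so they
    must come from a trusted source. `placeholder` is the driver's parameter
    marker ("%s" for psycopg2, "?" for sqlite3).
    """
    reader = csv.DictReader(handle)
    columns = [name.strip() for name in reader.fieldnames]
    sql = "INSERT INTO {} ({}) VALUES ({})".format(
        table, ", ".join(columns), ", ".join([placeholder] * len(columns))
    )
    cursor = conn.cursor()
    total = 0
    for batch in batched((clean_row(r) for r in reader), batch_size):
        cursor.executemany(sql, [tuple(r[c] for c in columns) for r in batch])
        total += len(batch)
    conn.commit()
    return total
```

Assuming the psycopg2 driver is installed, the connection could be opened with psycopg2.connect(host="localhost", port=5431, user="postgres", password="pass", dbname="postgres") and passed to load_csv together with an open file handle for the dataset.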

Thanks for watching this tutorial on data modeling with DBT. Don't forget to leave a like and subscribe to our channel for more data-centric tutorials and topics.

PS: Got a question or feedback on my content? Get in touch:
By leaving a comment on the video
On Twitter: @KQrios
Comments

Can you please share the link for the second video on DBT?

SanjayChakravarty-vf