Creating an ETL Data Pipeline on Google Cloud with Cloud Data Fusion & Airflow - Part 1

Creating an ETL Data Pipeline on Google Cloud with Cloud Data Fusion & Airflow
This tutorial walks through building an ETL pipeline on Google Cloud: extract data, transform it, load it into BigQuery, and visualize the result in Looker Studio.

Step 1: Generate dummy employee data with the Python Faker library and store it in a Google Cloud Storage (GCS) bucket.
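
Step 1 could look roughly like the sketch below. The column names, the record count, and the bucket and object names are assumptions for illustration, not the values used in the video. The upload helper assumes the `google-cloud-storage` client library is installed and credentials are configured.

```python
import csv
import io

# Hypothetical column set; the video's actual schema may differ.
FIELDS = ["first_name", "last_name", "email", "phone_number", "ssn", "salary", "password"]


def make_employee(fake):
    """Build one dummy employee record from a Faker instance."""
    return {
        "first_name": fake.first_name(),
        "last_name": fake.last_name(),
        "email": fake.email(),
        "phone_number": fake.phone_number(),
        "ssn": fake.ssn(),
        "salary": fake.random_int(min=30000, max=150000),
        "password": fake.password(),
    }


def to_csv(rows):
    """Serialize records to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def upload_to_gcs(bucket_name, blob_name, text):
    """Upload CSV text to a GCS object (needs google-cloud-storage and credentials)."""
    from google.cloud import storage  # imported lazily so the rest runs without it

    client = storage.Client()
    blob = client.bucket(bucket_name).blob(blob_name)
    blob.upload_from_string(text, content_type="text/csv")


def main():
    from faker import Faker  # pip install Faker

    fake = Faker()
    rows = [make_employee(fake) for _ in range(100)]
    # Placeholder bucket and object names; replace with your own.
    upload_to_gcs("my-employee-bucket", "employee_data.csv", to_csv(rows))
```

Calling `main()` generates 100 records and uploads them. `make_employee` and `to_csv` can be checked locally without any cloud access.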

Step 2: Create a Cloud Data Fusion instance, which will host the pipeline.

Step 3: Build the pipeline in Data Fusion. It transforms the data, masks sensitive fields, and loads the result into BigQuery for analysis.
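
To make the masking idea in Step 3 concrete, here is a minimal Python sketch of two common techniques: hiding all but the last few characters of a value, and replacing a value with a one-way hash. It only illustrates the concept. In the video the masking is done inside Data Fusion, and the field names and helper functions below are hypothetical.

```python
import hashlib


def mask_keep_last(value, visible=4, mask_char="x"):
    """Mask letters and digits except the last `visible` ones; keep separators."""
    total = sum(ch.isalnum() for ch in value)
    to_mask = max(total - visible, 0)
    out = []
    for ch in value:
        if ch.isalnum() and to_mask > 0:
            out.append(mask_char)
            to_mask -= 1
        else:
            out.append(ch)
    return "".join(out)


def hash_value(value):
    """Replace a value with its SHA-256 hex digest (one-way)."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def mask_record(record, mask_fields=("ssn",), hash_fields=("password",)):
    """Return a copy of the record with sensitive fields masked or hashed."""
    masked = dict(record)
    for name in mask_fields:
        if name in masked:
            masked[name] = mask_keep_last(str(masked[name]))
    for name in hash_fields:
        if name in masked:
            masked[name] = hash_value(str(masked[name]))
    return masked


print(mask_keep_last("123-45-6789"))  # xxx-xx-6789
```

Partial masking keeps a value recognizable to its owner, while hashing removes the original entirely but still lets equal values be matched.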

Step 4: Visualize the BigQuery data in Looker Studio.

Join me as I work through each stage of the ETL pipeline, from data orchestration to visualization, in the Google Cloud ecosystem.

Looking to get in touch?

Playlists
Associate Cloud Engineer - Complete Free Course

Google Cloud Data Engineer Certification Course

Google Cloud Platform (GCP) Tutorials

Generative AI

Getting Started with Duet AI

Google Cloud Projects

Python For GCP

Terraform Tutorials

LinkedIn

Medium Blog

GitHub
Source Code

#googlecloud #gcp #airflow #dataengineeringessentials #dataengineering #bigquery #dataengineeringprojects
Comments

Thanks Vishal for the detailed pipeline design and development video. Great job.

rajeshiyer

Thank you, Vishal, for doing this. It will definitely be a great help! Kudos to you!

AR-bylk

I am not getting the mask data option in Wrangler.

basavrajningadali

I am getting more environment errors while connecting Data Fusion, and the Python code has an error.

abhisheknaidu

Nice video. Can you create a pipeline using server-based or serverless Dataproc?

abdulfasith

Great video as always! Can you add timestamps for this video?

renvils

In place of Airflow, I want to use Mage AI.

Alfred_vinci

The Cloud Composer environment is showing an error, and the image version is not showing when I create the environment manually. Is there any update?

adityajoshi

Awesome video. Can you create a complete Composer/Airflow video for this one?

selvaarul

How do I use gcloud in VS Code?
Error: gcloud : The term 'gcloud' is not recognized as the name of a cmdlet, function, script file, or operable program. Check the spelling of the name, or if a path was included, verify that the path is correct and try again

TeekawinKirdsaeng

I got these errors "Cannot load filesystem: Provider not found. Can not load the default value of `spark.yarn.isHadoopProvided` from with error, Using `false` as a default value." Any clues on how to fix it?

yishanzhan

Amazing video. Unfortunately, I have problems creating my Cloud Composer environment, maybe because I am on a free trial.
I get this error after creating the environment:
CREATE operation on this environment failed 49 minutes ago with the following error message:
Some of the GKE pods failed to become healthy. Please check the GKE logs for details, and retry the operation.

lmarwarl

Composer shows "This environment has errors"

promitdutta

kindly make this kind of pipeline ETL video with the

punk

It says gcloud is not an executable, so your login steps don't work for everyone, and you did things beforehand without showing them in the video. Next time, please show everything from scratch, actually doing it rather than just saying it.

flosrv

Data Fusion is not parsing the salary and many other fields, although they are in the CSV.

rishabhtiwari