Redfin Analytics|python ETL pipeline with airflow|Data Engineering Project|Snowpipe|Snowflake|Part 2

This is part 2 of the Redfin Real Estate Data Analytics Python ETL data engineering project, which uses Apache Airflow, Snowpipe, Snowflake and AWS services.
In this project, you will learn how to connect to the Redfin data center data source and extract real estate data using Python. We then transform the data using pandas and load it into an Amazon S3 bucket. The raw data is also loaded into an Amazon S3 bucket.
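As a rough illustration of the transform and load steps, here is a minimal sketch. The column names, the sample rows, and the `transform` and `upload_csv` helpers are all assumptions made for this example and may not match the code in the video. The S3 upload uses the `put_object` call from boto3 and needs AWS credentials, so it is defined but not called here.

```python
import io

import pandas as pd

# Hypothetical sample in a tab-separated layout; the column names are
# illustrative assumptions, not confirmed by the video.
SAMPLE_TSV = (
    "period_begin\tcity\tmedian_sale_price\n"
    "2023-01-01\tSeattle\t750000\n"
    "2023-02-01\tSeattle\t\n"
    "2023-01-01\tAustin\t520000\n"
)


def transform(raw: pd.DataFrame) -> pd.DataFrame:
    """Drop incomplete rows and derive year/month columns from the period date."""
    df = raw.dropna().copy()
    df["period_begin"] = pd.to_datetime(df["period_begin"])
    df["period_begin_year"] = df["period_begin"].dt.year
    df["period_begin_month"] = df["period_begin"].dt.month
    return df.reset_index(drop=True)


def upload_csv(df: pd.DataFrame, bucket: str, key: str) -> None:
    """Write the DataFrame to S3 as CSV. Requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the transform can run without AWS access

    body = df.to_csv(index=False).encode("utf-8")
    boto3.client("s3").put_object(Bucket=bucket, Key=key, Body=body)


raw_df = pd.read_csv(io.StringIO(SAMPLE_TSV), sep="\t")
clean_df = transform(raw_df)
print(clean_df)
```

For a real gzip-compressed source file, `pd.read_csv` can read it directly with `compression="gzip"` and `sep="\t"`. The bucket name and object key passed to `upload_csv` would be your own.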
As soon as the transformed data lands in the AWS S3 bucket, Snowpipe is triggered and automatically runs a COPY command to load the transformed data into a Snowflake data warehouse table. We then connect Power BI to the Snowflake data warehouse to visualize the data and obtain insights.
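The Snowpipe step can be sketched as a `CREATE PIPE` statement with `AUTO_INGEST = TRUE` wrapped around a `COPY INTO` command. In the sketch below, the pipe, table and stage names are hypothetical placeholders, the file format options are an assumption, and the `build_pipe_sql` and `create_pipe` helpers are invented for this example. It also assumes an external stage pointing at the S3 bucket already exists.

```python
def build_pipe_sql(pipe: str, table: str, stage: str) -> str:
    """Return a CREATE PIPE statement that wraps a COPY INTO command."""
    return (
        f"CREATE OR REPLACE PIPE {pipe} AUTO_INGEST = TRUE AS\n"
        f"COPY INTO {table}\n"
        f"FROM @{stage}\n"
        "FILE_FORMAT = (TYPE = 'CSV' SKIP_HEADER = 1)"
    )


def create_pipe(sql: str, **connect_kwargs) -> None:
    """Run the statement in Snowflake. Requires snowflake-connector-python."""
    import snowflake.connector  # lazy import: only needed when actually connecting

    conn = snowflake.connector.connect(**connect_kwargs)
    try:
        conn.cursor().execute(sql)
    finally:
        conn.close()


# All three object names are hypothetical placeholders.
PIPE_SQL = build_pipe_sql(
    pipe="redfin_pipe",
    table="redfin_table",
    stage="redfin_stage",
)
print(PIPE_SQL)
```

Once the pipe exists, `SHOW PIPES` lists a notification channel for it. The S3 bucket's event notification has to target that channel so that new files trigger the load.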
Apache Airflow is used to orchestrate and automate this process.
Apache Airflow is an open-source platform for orchestrating and scheduling workflows of tasks and data pipelines. We install Apache Airflow on our EC2 instance to orchestrate the pipeline.
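To give an idea of what the orchestration could look like, here is a minimal DAG sketch, assuming Airflow 2.4 or later. The DAG id, task ids, schedule, file paths and placeholder functions are all hypothetical and are not necessarily the operators or names used in the video. Airflow is imported inside `build_dag` so the plain helper functions can be tried without an Airflow installation.

```python
from datetime import datetime


def extract_data() -> str:
    """Placeholder extract step; returns a hypothetical local path to raw data."""
    return "/tmp/redfin_raw.csv"


def transform_data(raw_path: str) -> str:
    """Placeholder transform step; returns a hypothetical path to cleaned data."""
    return raw_path.replace("_raw", "_clean")


def build_dag():
    """Build the DAG; Airflow is imported here so the helpers work without it."""
    from airflow import DAG
    from airflow.operators.python import PythonOperator

    with DAG(
        dag_id="redfin_analytics_dag",  # hypothetical name
        start_date=datetime(2024, 1, 1),
        schedule="@weekly",  # assumed cadence
        catchup=False,
    ) as dag:
        extract = PythonOperator(
            task_id="extract_redfin_data",
            python_callable=extract_data,
        )
        transform = PythonOperator(
            task_id="transform_redfin_data",
            python_callable=transform_data,
            # Pull the path returned by the extract task from XCom.
            op_args=["{{ ti.xcom_pull(task_ids='extract_redfin_data') }}"],
        )
        extract >> transform  # run extract first, then transform
    return dag
```

In a real DAG file placed in the Airflow `dags` folder, you would call `dag = build_dag()` at module level so the scheduler can discover it. A load-to-S3 task would be chained after the transform task in the same way.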
Remember, the best way to learn data engineering is by doing data engineering. Get your hands dirty!
If you have any questions or comments, please leave them in the comment section below.
Please don’t forget to LIKE, SHARE, COMMENT and SUBSCRIBE to our channel for more AWESOME videos.

**Books I recommend**

***************** Commands used in this video *****************

***************** USEFUL LINKS *****************

DISCLAIMER: This video and description have affiliate links. This means that when you buy through one of these links, we receive a small commission at no cost to you. This helps support us in continuing to make awesome and valuable content for you.
Comments
Author

You are a legend to Indians. Your project is very informative.

Simplelifewithgowtham
Author

Successfully completed this project again! Much thanks to Dr Yemi.

donatus.enebuse
Author

Just binge-watched both parts. Very well explained.
Could you please make a video on doing some machine learning on these big datasets?
Thanks

nitindatta
Author

Wow! Thank you very much! Very nice channel. Well explained, with amazing projects.

luiscamilofranco
Author

Hi, absolutely loved this project, will definitely try to use another dataset and replicate this on my own. I have a question though. Which books/resources would you recommend learning more about Airflow and Snowflake? I would really like to increase my grasp over these two domains.

JayRavalani
Author

Another great video, thanks! I have a request: could you make a project that validates data from APIs or a database, deploys the validation on AWS, and visualizes or reports the results? Thanks

aydemir
Author

Great showcase :))) Can you include dbt as a transformation tool in your future projects?

lesaplmansion
Author

Thank you for the interesting project 😁
PS. No problems connecting with Power BI haha 😂

nnnn
Author

Another comment: I'm seeing a lot of job postings for Microsoft Azure. Do you plan to make some projects on that side of data engineering as well? And could you explain, in a separate video, the differences between AWS, GCP and Azure, and where Snowflake sits among them? Thanks

aydemir
Author

Seconded. I would be very interested in your data warehousing course, sir.

oyekanemmanuel
Author

If you create courses on data engineering topics like data warehousing and big data, I would like to join.

errrbrrr
Author

I have one request: can you please make videos using large files?

Simplelifewithgowtham
Author

Do you have any idea how to unzip a file in S3 using Airflow? Preferably with several simple use cases. Please, please, please, brother.

Simplelifewithgowtham