Apache Airflow for Data Science #2 - How to Write Your First Airflow DAG (Data Pipeline)

Apache Airflow is a common tool used by Data Engineers. Learn how to write your first data pipeline (DAG) in 10 minutes.

00:00 Introduction
00:37 Initial setup
01:21 Airflow DAG boilerplate code
04:07 Task #1 - Get current datetime
05:46 Task #2 - Process current datetime
09:50 Task #3 - Save processed datetime
14:45 How to connect the Airflow DAG
16:26 How to run the Airflow DAG
17:36 Outro
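
The three task chapters above can be sketched in plain Python, without Airflow, to show the shape of the data passed from step to step. This is only a rough sketch under assumptions: the function names mirror the chapter titles, but the field names, the ISO datetime format, and the CSV layout are hypothetical and may differ from what the video actually builds.

```python
import csv
import os
import tempfile
from datetime import datetime


def get_datetime() -> str:
    """Task #1 (assumed form): return the current datetime as an ISO string."""
    return datetime.now().isoformat(timespec="seconds")


def process_datetime(raw: str) -> dict:
    """Task #2 (assumed form): split the raw string into separate fields."""
    if not raw:
        # Fail loudly when the previous step handed over nothing.
        raise ValueError("No datetime value received from the previous step.")
    dt = datetime.fromisoformat(raw)
    return {
        "year": dt.year,
        "month": dt.month,
        "day": dt.day,
        "time": dt.strftime("%H:%M:%S"),
    }


def save_datetime(row: dict, path: str) -> None:
    """Task #3 (assumed form): append the processed row to a CSV file."""
    # Write the header only when the file is new or empty.
    write_header = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(row.keys()))
        if write_header:
            writer.writeheader()
        writer.writerow(row)
```

Calling `save_datetime(process_datetime(get_datetime()), some_path)` runs the three steps in order. In an Airflow DAG, each function would typically be wrapped in its own task (for example with `PythonOperator` and its `python_callable` argument), and the value returned by one task would reach the next through XCom rather than through a direct function call.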

Comments

I appreciate the effort you put into this video. It deserves more than likes and comments. Great job, mate!

RajeshSamson

Thank you, Dario. I have only recently started learning Apache Airflow.

paleface_brother

After following the steps in part 1 and in this video, why do I get "airflow imports could not be resolved"? I installed Airflow in part 1.

damiencheung

The process_datetime task keeps giving an error.

damolaolayinka-osho

ModuleNotFoundError: No module named 'pandas'
How do I solve this error, bro?

karthickraja

  File "/home/airflow/.local/lib/python3.7/site-packages/airflow/operators/python.py", line 171, in execute
    return_value = self.execute_callable()
  File "/home/airflow/.local/lib/python3.7/site-packages/airflow/operators/python.py", line 189, in execute_callable
    return self.python_callable(*self.op_args, **self.op_kwargs)
  File "/opt/airflow/dags/postgres_db_dag.py", line 24, in process_iris_data
    raise Exception('No data.')

Could you help me, sir? Where did I go wrong?

rohmankpaii

Hi @Better Data Science,
I wrote the same script as you, but ti.xcom_pull returns NoneType instead of the datetime. Do you have a solution, please? Thanks.

salimmzoughi