Dynamically Generating DAGs in Airflow

Since the release of dynamic task mapping in Airflow 2.3, many of the concepts in this webinar have been changed and improved upon. Please check out our newer Dynamic Tasks in Airflow webinar for the latest dynamic DAG best practices, including how dynamic tasks can accomplish many of the same use cases more efficiently.

The simplest way of creating an Airflow DAG is to write it as a static Python file. However, sometimes manually writing DAGs isn't practical.

Maybe you have hundreds or thousands of DAGs that do similar things, with just a parameter changing between them. Or maybe you need a set of DAGs to load tables, but don't want to manually update DAGs every time those tables change. In these cases, and others, it can make more sense to dynamically generate DAGs. Because everything in Airflow is code, you can dynamically generate DAGs using Python alone.
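As a minimal sketch of the "just a parameter changing" case, the loop below builds one DAG-like object per parameter value and registers each under a module-level name. Airflow loads DAG objects it finds in a DAG file's global namespace, which is why the `globals()` assignment matters. To keep the example runnable without Airflow installed, `SketchDag` is a hypothetical stand-in for `airflow.DAG`, and `find_dags` is only a simplified illustration of discovery, not Airflow's actual parsing code. The DAG ids and schedule are made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SketchDag:
    """Hypothetical stand-in for airflow.DAG so the sketch runs without Airflow."""
    dag_id: str
    schedule: str
    default_args: dict = field(default_factory=dict)


def create_dag(dag_id, schedule, dag_number, default_args):
    # In a real DAG file this would build an airflow.DAG and attach tasks to it.
    args = {**default_args, "dag_number": dag_number}
    return SketchDag(dag_id=dag_id, schedule=schedule, default_args=args)


default_args = {"owner": "airflow"}

# One DAG per parameter value; only the number changes between them.
for dag_number in range(1, 4):
    dag_id = f"loader_{dag_number}"
    # Registering the object under a module-level name makes it discoverable.
    globals()[dag_id] = create_dag(dag_id, "@daily", dag_number, default_args)


def find_dags(namespace):
    # Simplified illustration: collect DAG-like objects from a module namespace.
    return sorted(
        obj.dag_id for obj in namespace.values() if isinstance(obj, SketchDag)
    )


generated_ids = find_dags(globals())
```

In a real deployment the list of parameters would more likely come from a config file or a database query than from a hard-coded `range`.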

In this webinar, we'll talk about when you might want to dynamically generate your DAGs, show a couple of methods for doing so, and discuss problems that can arise when implementing dynamic generation at scale.

In this webinar we cover:
- How Airflow identifies a DAG
- Use cases for dynamically generating DAGs
- Commonly used methods for dynamic generation
- Pitfalls and common issues with dynamic generation
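One commonly used approach is to render a separate DAG file per config entry from a template, so each DAG lives in its own Python file. The sketch below illustrates that idea with the standard library only. The template contents, the config entries, and the helper `render_dag_files` are hypothetical, and the generated files assume Airflow 2.x (`schedule_interval`, `airflow.operators.bash.BashOperator`).

```python
import json
import tempfile
from pathlib import Path
from string import Template

# Hypothetical template for a generated DAG file; $-placeholders are filled per entry.
DAG_TEMPLATE = Template('''\
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="$dag_id",
    start_date=datetime(2022, 1, 1),
    schedule_interval="$schedule",
    catchup=False,
) as dag:
    BashOperator(task_id="load_table", bash_command="echo loading $table")
''')


def render_dag_files(configs, output_dir):
    """Write one DAG file per config entry and return the paths written."""
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for cfg in configs:
        path = output_dir / f"{cfg['dag_id']}.py"
        path.write_text(DAG_TEMPLATE.substitute(cfg))
        written.append(path)
    return written


# Hypothetical config; in practice this might live in a JSON or YAML file.
configs = json.loads('''[
  {"dag_id": "load_orders", "schedule": "@daily", "table": "orders"},
  {"dag_id": "load_customers", "schedule": "@hourly", "table": "customers"}
]''')

# A temporary directory stands in for the Airflow DAGs folder here.
with tempfile.TemporaryDirectory() as tmp:
    paths = render_dag_files(configs, tmp)
    generated_names = [p.name for p in paths]
    first_file_text = paths[0].read_text()
```

Compared with generating many DAGs from a single file, this keeps each generated DAG individually readable, at the cost of needing to re-run the generator whenever the config changes.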

#learnwithastronomer #dags #dynamicdags
Comments

Finally dynamic DAG creation made simple in this video! Thank you!

magicgoku

Thanks for the great talk. We'd be happy to hear more about dynamic task creation, especially tasks that are affected by the results of previous tasks in the DAG.

talnagar

Really useful and very clearly explained. Thanks!

joshuaisland

Thanks for this great content. Much Appreciated :)

mobeenmehdi

Thank you so much.
This is what I was looking for.
Can we have more sessions like these?

rohit

Say I have a DAG run in progress and I dynamically update the DAG's tasks. Will that break the existing DAG run? Say that particular DAG run has 10 tasks to do, and I update the DAG while it's on the 1st task. Will it keep running the old tasks, with only newly started DAG runs picking up the updated tasks?

aditya

I saved all my DAGs, each with multiple tasks and their dependency definitions, into a single config file. Each DAG has its own scheduler, and each task has its own customer handler. I could dynamically generate all the DAGs pretty well. The only challenge is that the tasks in my DAGs stay queued even when I set schedule_interval to None or @once.

fanzhang

Thank you for this tutorial! I followed the 1st method of generating the DAGs (single file) and I see the DAGs being generated in the UI, but when I try to run one, I see an error from the executor which says "dag_id could not be found: <dag_id>. Either the dag did not exist or it failed to parse."

Even though the code below adds the dag to the global scope, I am wondering why it is NOT able to find the dag that has been generated when I try to run it:

globals()[dag_id] = create_dag(dag_id,
                               schedule,
                               dag_number,
                               default_args)

I am not able to figure out why the executor is NOT able to find the dag that has been generated.

TheKaluve

Is there any way to get a list of DAG ids using Python?

shiva_r