How to run an Airflow DAG in a loop

This article is a part of my "100 data engineering tutorials in 100 days" challenge. (63/100)

Airflow does not support DAGs with loops. After all, DAG stands for Directed Acyclic Graph, so the task graph can't contain cycles. Looping is also not the standard usage of Airflow, which was built to support daily batch processing.

None of that stops us from using a simple trick that lets us run a DAG in a loop. To do that, we add a TriggerDagRunOperator as the last task in the DAG. The graph itself stays acyclic because the operator does not point back to the first task; it starts a new run of the DAG. In the task configuration, we specify the DAG id of the DAG that contains the task:

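# Airflow 1.10 import path; in Airflow 2.x the operator lives in
# airflow.operators.trigger_dagrun
from airflow.operators.dagrun_operator import TriggerDagRunOperator

# The last task triggers a new run of the DAG it belongs to
trigger_self = TriggerDagRunOperator(
    task_id='repeat',
    trigger_dag_id=dag.dag_id,
    dag=dag
)

the_rest_of_the_dag >> trigger_self  # add it as the last task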
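
To see why this trick does not turn the DAG into a cyclic graph, it may help to look at a toy model of what happens. The sketch below is plain Python, not Airflow code: the function name, the task names, and the max_runs cap are all made up for illustration, and the cap exists only so the example terminates. It shows that every run executes its tasks once, in order, and that the final "repeat" step queues a new run instead of jumping back to the first task.

```python
from collections import deque


def run_dag_in_loop(dag_id, tasks, max_runs):
    """Toy model of a self-triggering DAG (hypothetical, not the Airflow API)."""
    queue = deque([dag_id])  # pending DAG runs, similar to a scheduler queue
    log = []
    runs = 0
    while queue and runs < max_runs:
        current = queue.popleft()
        runs += 1
        # Inside one run, the tasks execute once, in order (no cycle).
        for task in tasks:
            log.append((runs, task))
        # The last task requests a new run of the same DAG.
        log.append((runs, "repeat"))
        queue.append(current)
    return log


log = run_dag_in_loop("example_dag", ["extract", "load"], max_runs=3)
for run_number, task in log:
    print(run_number, task)
```

In this model, the loop is a property of the queue of runs, not of the task graph, which is the same reason the real DAG remains valid.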



Bartosz Mikulski
Bartosz Mikulski * MLOps Engineer / data engineer * conference speaker * co-founder of Software Craft Poznan & Poznan Scala User Group
