Listener Plugin of Airflow

Airflow lets you add listeners, through plugins, to monitor and track task state.

This simple example listener plugin tracks task state and collects useful metadata about the task, the DAG run, and the DAG.

The plugin works by using SQLAlchemy’s event mechanism: it watches for task instance state changes at the table level and triggers an event. Listeners are notified for every task across all DAGs.

The plugin itself is a class derived from the base class airflow.plugins_manager.AirflowPlugin.

The listener mechanism uses pluggy under the hood. Pluggy is a library for plugin management and hook calling that was originally built for pytest. It enables function hooking, so you can build “pluggable” systems and customize their behavior through those hooks.
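To make the hook mechanism concrete, here is a minimal pluggy sketch that is independent of Airflow. The project name "demo", the DemoSpec and PrintingListener classes, and the on_state_change hook are hypothetical names chosen for illustration; only the pluggy calls themselves (HookspecMarker, HookimplMarker, PluginManager) are real library API.

```python
import pluggy

# "demo" is a hypothetical project name used only in this sketch.
hookspec = pluggy.HookspecMarker("demo")
hookimpl = pluggy.HookimplMarker("demo")


class DemoSpec:
    """Declares which hooks exist and what arguments they take."""

    @hookspec
    def on_state_change(self, task_id, new_state):
        """Called whenever a task changes state."""


class PrintingListener:
    """A plugin that implements the hook declared above."""

    @hookimpl
    def on_state_change(self, task_id, new_state):
        return f"{task_id} -> {new_state}"


manager = pluggy.PluginManager("demo")
manager.add_hookspecs(DemoSpec)
manager.register(PrintingListener())

# Hooks are called with keyword arguments; the non-None results of all
# registered implementations come back as a list.
results = manager.hook.on_state_change(task_id="extract", new_state="running")
print(results)
```

Airflow follows the same pattern: it defines the hook specifications, and a listener module supplies functions marked with hookimpl.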

With this plugin, the following events can be listened for:
  • a task instance is in the running state.
  • a task instance is in the success state.
  • a task instance is in the failure state.
  • a DAG run is in the running state.
  • a DAG run is in the success state.
  • a DAG run is in the failure state.
  • before a component such as an Airflow job, the scheduler, or a backfill job starts.
  • before a component such as an Airflow job, the scheduler, or a backfill job stops.
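The last two events in the list are lifecycle events rather than state changes. The sketch below shows what listening for them might look like, assuming the hook names on_starting and before_stopping with a single component argument; verify this against the listener hook specification of your Airflow version. The component_name helper is hypothetical, and the ImportError fallback exists only so the sketch can be run without Airflow installed.

```python
import logging

try:
    from airflow.listeners import hookimpl
except ImportError:
    # Fallback so this sketch can be exercised without Airflow installed.
    def hookimpl(func):
        return func

log = logging.getLogger(__name__)


def component_name(component):
    """Return a readable name for the component (hypothetical helper)."""
    return type(component).__name__


@hookimpl
def on_starting(component):
    """Called before a component such as the scheduler starts."""
    log.info("Starting component: %s", component_name(component))


@hookimpl
def before_stopping(component):
    """Called before a component such as the scheduler stops."""
    log.info("Stopping component: %s", component_name(component))
```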

Listener Registration

A listener is registered as part of an Airflow plugin, which holds a reference to the listener object. The following skeleton shows how to register a new listener:

from airflow.plugins_manager import AirflowPlugin

# The listener module, where the custom monitoring code is added as hookimpl functions
import listener

class MetadataCollectionPlugin(AirflowPlugin):
    name = "MetadataCollectionPlugin"
    listeners = [listener]

Next, look at the code in the listener module and the hook implementations it contains. Once implemented, the listener code runs during every task execution across all DAGs.

For reference, here is the listener code that implements these hooks:

This example listens for a task instance entering the running state:


from airflow.listeners import hookimpl
from airflow.models.taskinstance import TaskInstance
from airflow.utils.state import TaskInstanceState


@hookimpl
def on_task_instance_running(previous_state: TaskInstanceState, task_instance: TaskInstance, session):
    """
    This method is called when the task state changes to RUNNING.
    Through the callback, parameters such as previous_state and the task_instance object can be accessed.
    They give more information about the running task_instance: its dag_run,
    task, and dag.
    """
    print("Task instance is in running state")
    print(" Previous state of the Task instance:", previous_state)

    state: TaskInstanceState = task_instance.state
    name: str = task_instance.task_id
    start_date = task_instance.start_date

    dagrun = task_instance.dag_run
    dagrun_status = dagrun.state

    task = task_instance.task

    assert task

    dag = task.dag
    dag_name = None
    if dag:
        dag_name = dag.dag_id
    print(f"Current task name:{name} state:{state} start_date:{start_date}")
    print(f"Dag name:{dag_name} and current dag run status:{dagrun_status}")

Similarly, listeners for task instance success and failure can be implemented.
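As a rough sketch of those two listeners, the code below assumes the hooks are named on_task_instance_success and on_task_instance_failed and take the same (previous_state, task_instance, session) arguments as the running hook above. Some Airflow versions use different arguments, so check the hook specification for your version. The summarize helper is hypothetical, and the ImportError fallback is there only so the sketch runs without Airflow.

```python
import logging

try:
    from airflow.listeners import hookimpl
except ImportError:
    # Fallback so this sketch can be exercised without Airflow installed.
    def hookimpl(func):
        return func

log = logging.getLogger(__name__)


def summarize(task_instance, previous_state):
    """Build a one-line summary of the transition (hypothetical helper)."""
    return (
        f"task={task_instance.task_id} dag={task_instance.dag_id} "
        f"run={task_instance.run_id} {previous_state} -> {task_instance.state}"
    )


@hookimpl
def on_task_instance_success(previous_state, task_instance, session):
    """Called when a task instance changes to SUCCESS."""
    log.info("Task succeeded: %s", summarize(task_instance, previous_state))


@hookimpl
def on_task_instance_failed(previous_state, task_instance, session):
    """Called when a task instance changes to FAILED."""
    log.warning("Task failed: %s", summarize(task_instance, previous_state))
```

Using the logging module instead of print lets the messages follow whatever log configuration is already in place.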

This example listens for a DAG run changing to the failed state:


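from airflow.listeners import hookimpl
from airflow.models.dagrun import DagRun


@hookimpl
def on_dag_run_failed(dag_run: DagRun, msg: str):
    """This method is called when the dag run state changes to FAILED."""
    print("Dag run in failure state")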
    dag_id = dag_run.dag_id
    run_id = dag_run.run_id
    external_trigger = dag_run.external_trigger

    print(f"Dag information:{dag_id} Run id: {run_id} external trigger: {external_trigger}")

Similarly, listeners for DAG run success and for the running state can be implemented.
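A possible shape for those two listeners is sketched here, assuming the hooks are named on_dag_run_running and on_dag_run_success and take the same (dag_run, msg) arguments as the failed hook above. The run_duration_seconds helper is hypothetical; it returns None when either date is missing, because this sketch does not assume end_date is already set when the hook fires. The ImportError fallback is there only so the sketch runs without Airflow.

```python
import logging

try:
    from airflow.listeners import hookimpl
except ImportError:
    # Fallback so this sketch can be exercised without Airflow installed.
    def hookimpl(func):
        return func

log = logging.getLogger(__name__)


def run_duration_seconds(dag_run):
    """Return the run duration in seconds, or None if a date is missing (hypothetical helper)."""
    if dag_run.start_date is None or dag_run.end_date is None:
        return None
    return (dag_run.end_date - dag_run.start_date).total_seconds()


@hookimpl
def on_dag_run_running(dag_run, msg):
    """Called when a DAG run changes to RUNNING."""
    log.info("DAG run %s of %s is running: %s", dag_run.run_id, dag_run.dag_id, msg)


@hookimpl
def on_dag_run_success(dag_run, msg):
    """Called when a DAG run changes to SUCCESS."""
    log.info(
        "DAG run %s of %s succeeded, duration (s): %s",
        dag_run.run_id,
        dag_run.dag_id,
        run_duration_seconds(dag_run),
    )
```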

The listener plugin files that contain the listener implementation are added, as part of the Airflow plugin, to the $AIRFLOW_HOME/plugins/ folder and are loaded during Airflow startup.
