Writing Logs to Azure Blob Storage

Airflow can be configured to read and write task logs in Azure Blob Storage. It does this through an existing Airflow connection, so if that connection is not set up correctly, reading and writing logs will fail.

Follow the steps below to enable Azure Blob Storage logging:

  1. Airflow's logging system requires a custom .py file to be located on the PYTHONPATH so that it is importable from Airflow. Start by creating a directory to store the config file; $AIRFLOW_HOME/config is recommended.

  2. Create empty files called $AIRFLOW_HOME/config/log_config.py and $AIRFLOW_HOME/config/__init__.py.

  3. Copy the contents of airflow/config_templates/airflow_local_settings.py into the log_config.py file created in Step 2.

  4. Customize the following portions of the template:

    # wasb bucket names should start with "wasb" so that Airflow selects the correct handler
    REMOTE_BASE_LOG_FOLDER = 'wasb-<whatever you want here>'

  5. Make sure an Azure Blob Storage (Wasb) connection hook has been defined in Airflow. The hook should have read and write access to the Azure Blob Storage bucket defined above in REMOTE_BASE_LOG_FOLDER.

  6. Update $AIRFLOW_HOME/airflow.cfg to contain:

    remote_logging = True
    logging_config_class = log_config.LOGGING_CONFIG
    remote_log_conn_id = <name of the Azure Blob Storage connection>

  7. Restart the Airflow webserver and scheduler, and trigger (or wait for) a new task execution.

  8. Verify that logs are showing up for newly executed tasks in the bucket you have defined.
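
The file-system and configuration steps above can be sanity-checked with a short script. The following is a minimal sketch using only the Python standard library, not part of Airflow itself. The helper names (scaffold_config_dir, is_wasb_folder, check_airflow_cfg) are hypothetical and chosen for illustration. Because the steps above do not say which section of airflow.cfg holds the three settings, the check assumes they may appear in any section.

```python
import configparser
import tempfile
from pathlib import Path

# Hypothetical helpers for illustration; only the paths and setting names
# come from the steps above.
EXPECTED_CONFIG_CLASS = "log_config.LOGGING_CONFIG"


def scaffold_config_dir(airflow_home: Path) -> Path:
    """Steps 1-2: create <airflow_home>/config with two empty files."""
    config_dir = airflow_home / "config"
    config_dir.mkdir(parents=True, exist_ok=True)
    for name in ("log_config.py", "__init__.py"):
        # touch() creates the file if missing and leaves existing content alone.
        (config_dir / name).touch(exist_ok=True)
    return config_dir


def is_wasb_folder(remote_base_log_folder: str) -> bool:
    """Step 4: the REMOTE_BASE_LOG_FOLDER value should start with "wasb"."""
    return remote_base_log_folder.startswith("wasb")


def check_airflow_cfg(cfg_text: str) -> list:
    """Step 6: return a list of problems found in the text of airflow.cfg."""
    # interpolation=None keeps "%" characters in values from being expanded.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)

    # Assumption: the section name is not stated above, so merge all sections.
    settings = {}
    for section in parser.sections():
        settings.update(parser.items(section))

    problems = []
    if settings.get("remote_logging", "").strip().lower() != "true":
        problems.append("remote_logging is not True")
    if settings.get("logging_config_class", "").strip() != EXPECTED_CONFIG_CLASS:
        problems.append("logging_config_class is not " + EXPECTED_CONFIG_CLASS)
    if not settings.get("remote_log_conn_id", "").strip():
        problems.append("remote_log_conn_id is empty or missing")
    return problems


if __name__ == "__main__":
    # Demonstrate against a throwaway directory instead of a real AIRFLOW_HOME.
    with tempfile.TemporaryDirectory() as tmp:
        created = scaffold_config_dir(Path(tmp))
        print(sorted(p.name for p in created.iterdir()))

    sample_cfg = (
        "[core]\n"
        "remote_logging = True\n"
        "logging_config_class = log_config.LOGGING_CONFIG\n"
        "remote_log_conn_id = my_wasb_connection\n"
    )
    print(check_airflow_cfg(sample_cfg))
    print(is_wasb_folder("wasb-airflow-logs"))
```

To use it against a real installation, pass the actual AIRFLOW_HOME path to scaffold_config_dir and the contents of $AIRFLOW_HOME/airflow.cfg to check_airflow_cfg. An empty list from check_airflow_cfg means the three settings from Step 6 are present. It does not confirm that the named connection exists or that it has access to the storage account.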
