Writing logs to HDFS

Remote logging to HDFS uses an existing Airflow connection to read and write logs. If that connection is not set up properly, remote logging will fail.

Enabling remote logging

To enable this feature, airflow.cfg must be configured as follows:

[logging]
# Airflow can store logs remotely in HDFS. Users must supply a remote
# location URL (starting with 'hdfs://...') and an Airflow connection
# id that provides access to the storage location.
remote_logging = True
remote_base_log_folder = hdfs://some/path/to/logs
remote_log_conn_id = webhdfs_default

In the above example, Airflow will try to use WebHDFSHook('webhdfs_default').
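
Because a misconfigured [logging] section makes remote logging fail, it can help to sanity-check the three settings before restarting Airflow. The following is a minimal sketch using only the Python standard library. The helper name check_hdfs_remote_logging is hypothetical and not part of Airflow, and the check looks only at the text of the configuration; it does not confirm that the connection id exists or that HDFS is reachable.

```python
import configparser


def check_hdfs_remote_logging(cfg_text):
    """Return a list of problems found in the [logging] section (empty if none).

    Hypothetical helper for illustration; it is not part of Airflow.
    """
    # Interpolation is disabled so that '%' characters elsewhere in
    # airflow.cfg are not treated as configparser interpolation markers.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)

    if not parser.has_section("logging"):
        return ["missing [logging] section"]

    logging_cfg = parser["logging"]
    problems = []

    if not logging_cfg.getboolean("remote_logging", fallback=False):
        problems.append("remote_logging is not True")

    base_folder = logging_cfg.get("remote_base_log_folder", fallback="")
    if not base_folder.startswith("hdfs://"):
        problems.append("remote_base_log_folder does not start with hdfs://")

    conn_id = logging_cfg.get("remote_log_conn_id", fallback="")
    if not conn_id.strip():
        problems.append("remote_log_conn_id is empty")

    return problems


# The same settings as in the example configuration above.
example = """
[logging]
remote_logging = True
remote_base_log_folder = hdfs://some/path/to/logs
remote_log_conn_id = webhdfs_default
"""

print(check_hdfs_remote_logging(example))
```

To check a real installation, pass the contents of your own airflow.cfg to the function instead of the inline example string.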
