Provider packages

Apache Airflow 2 is built in a modular way. The Airflow “Core” provides the core scheduler functionality, which allows you to write some basic tasks, but the capabilities of Apache Airflow can be extended by installing additional packages, called providers.

Providers can contain operators, hooks, sensors, and transfer operators to communicate with a multitude of external systems, but they can also extend Airflow core with new capabilities.

You can install provider packages separately to interface with a given service. Providers are designed so that you can easily write your own. The Apache Airflow community develops and maintains more than 80 provider packages, but you are free to develop your own providers. The providers you build have exactly the same capabilities as the providers written by the community, so you can release and share them with others.

If you want to learn how to build your own custom provider, you can find all the information about it at How to create your own provider.

The full list of all the community-managed providers is available at Providers Index.

You can also see an index of all the community providers' operators and hooks in Operators and Hooks Reference.

Extending Airflow core functionality

Providers let you extend core Airflow with extra capabilities. Core Airflow provides basic, solid scheduling functionality, and providers build on it. This section describes all the capabilities that providers can add.

Airflow automatically discovers which providers add those additional capabilities. Once you install a provider package and restart Airflow, they become automatically available to Airflow users.

A summary of all the core functionalities that can be extended is available in Core Extensions.
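As a rough illustration of what this discovery builds on, the sketch below lists the provider entry points visible in the current Python environment. It assumes that providers register themselves under the `apache_airflow_provider` entry point group, which is the convention community providers follow. The helper name `installed_provider_entry_points` is made up for this example, and the code is not Airflow's own discovery logic.

```python
from importlib.metadata import entry_points

# Assumption: providers advertise themselves under this entry point group.
PROVIDER_GROUP = "apache_airflow_provider"


def installed_provider_entry_points():
    """Return the sorted entry point targets registered by installed providers."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+: entry points are filtered with .select().
        group = eps.select(group=PROVIDER_GROUP)
    else:
        # Python 3.8/3.9: entry_points() returns a dict keyed by group name.
        group = eps.get(PROVIDER_GROUP, [])
    return sorted(ep.value for ep in group)


if __name__ == "__main__":
    for target in installed_provider_entry_points():
        print(target)
```

In an environment with no provider packages installed, the function simply returns an empty list.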

Configuration

Providers can have their own configuration options, which allow you to configure how they work.

You can see all community-managed providers with their own configuration in Configurations.
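Like other Airflow configuration options, provider options can usually be set through environment variables that follow the `AIRFLOW__{SECTION}__{KEY}` naming pattern. The sketch below builds such a name. The section and option names used here (`celery`, `worker_concurrency`) are only examples, the helper `config_env_var` is hypothetical, and section names containing dots are not handled; check the Configurations reference for the options a given provider actually contributes.

```python
def config_env_var(section: str, key: str) -> str:
    """Build the environment variable name for a configuration option."""
    # Pattern: AIRFLOW__<SECTION>__<KEY>, upper-cased, with double underscores.
    return f"AIRFLOW__{section.upper()}__{key.upper()}"


if __name__ == "__main__":
    # Example section/option names only; they are not taken from this page.
    print(config_env_var("celery", "worker_concurrency"))
```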

Auth backends

Providers can add custom authentication backends that allow you to configure how your web server authenticates your users, integrating it with public or private authentication services.

You can see all the authentication backends available via community-managed providers in Auth backends.

Custom connections

Providers can add custom connection types, extending the connection form and handling custom form field behaviour for the connections defined by the provider.

You can see all the custom connections available via community-managed providers in Connections.

Logging

Providers can add additional task logging capabilities. By default, Apache Airflow saves task logs locally and makes them available to the Airflow UI via an internal HTTP server. However, providers can add extra logging capabilities, so that Airflow logs can be written to a remote service and retrieved from it.

You can see all task loggers available via community-managed providers in Writing logs.

Secret backends

Airflow can read connections, variables, and configuration from secret backends rather than from its own database.

You can see all the secret backends available via community-managed providers in Secret backends.

Notifications

Providers can add custom notifications that allow you to configure how you would like to be notified about the status of your tasks and DAGs.

You can see all the notifications available via community-managed providers in Notifications.

Installing and upgrading providers

Separate provider packages offer possibilities that were not available in Airflow 1.10:

  1. You can upgrade to the latest version of a particular provider without upgrading Apache Airflow core.

  2. You can downgrade to a previous version of a particular provider if the new version introduces problems, without impacting the main Apache Airflow core package.

  3. You can release, upgrade, and downgrade provider packages incrementally and independently of each other. This means that you can validate each provider package update in your environment incrementally, following your usual tests.
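Upgrading and downgrading a single provider both come down to installing a specific version of that provider's package. The sketch below builds such a `pip install` command without running it. The helper name `provider_pin_command` and the version `8.0.0` are placeholders chosen for illustration, not recommendations.

```python
import sys


def provider_pin_command(provider: str, version: str) -> list:
    """Build a pip command that installs one exact provider version."""
    requirement = f"apache-airflow-providers-{provider}=={version}"
    # Using "python -m pip" targets the interpreter running this script.
    return [sys.executable, "-m", "pip", "install", requirement]


if __name__ == "__main__":
    # Placeholder provider and version, for illustration only.
    command = provider_pin_command("amazon", "8.0.0")
    print(" ".join(command))
```

To actually perform the installation, the returned list could be passed to `subprocess.run(command, check=True)`. The same command shape serves both directions: pinning a higher version upgrades the provider, and pinning a lower one downgrades it.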

Types of providers

Providers have the same capabilities, whether they are provided by the community or by third parties. This chapter explains how community-managed providers are versioned and released, and how you can create your own providers.

Community maintained providers

From the community's point of view, Airflow is delivered in multiple, separate packages. The core of the Airflow scheduling system is delivered as the apache-airflow package, and there are more than 80 provider packages that can be installed separately. Those packages are named with the apache-airflow-providers prefix: for example, apache-airflow-providers-amazon or apache-airflow-providers-google.

Community maintained providers are released and versioned separately from Airflow releases. The packages follow the SemVer versioning scheme. Some versions of provider packages might depend on particular versions of Airflow, but the general approach is that, unless there is a good reason, new versions of providers should work with recent versions of Airflow 2.x. Details vary per provider: if a particular version of a provider constrains the Airflow version it can be used with, that limitation is included in the dependencies of the provider package.

Each community provider has a corresponding extra that can be used when installing Airflow to install the provider together with Apache Airflow. For example, installing apache-airflow[google,amazon] (with correct constraints, see Installation of Airflow®) installs the appropriate versions of the apache-airflow-providers-amazon and apache-airflow-providers-google packages together with Apache Airflow.

Some of the community providers have cross-provider dependencies as well. Those are not required dependencies; they might simply enable certain features (for example, transfer operators often create a dependency between different providers). Again, the general approach is that providers are backwards compatible, including cross-dependencies. Any breaking changes and requirements on particular versions of other provider packages are automatically documented in the release notes of every provider.
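The naming convention can be sketched as a pair of small helpers. They map a provider id to the name of its distribution on PyPI and to the Python package it is imported from. The helper names are hypothetical, and the dotted id `cncf.kubernetes` is used only to show how dots are handled.

```python
def distribution_name(provider_id: str) -> str:
    """Name of the installable package for a provider id."""
    # Dots in the provider id become dashes in the distribution name.
    return "apache-airflow-providers-" + provider_id.replace(".", "-")


def import_package(provider_id: str) -> str:
    """Python package under which the provider's modules live."""
    return "airflow.providers." + provider_id


if __name__ == "__main__":
    for pid in ("amazon", "google", "cncf.kubernetes"):
        print(pid, distribution_name(pid), import_package(pid))
```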
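Under SemVer, the part of the version number that changes signals the kind of release: a major bump announces breaking changes, a minor bump adds features, and a patch bump carries fixes. The sketch below classifies the step between two provider versions. It is a simplified illustration that assumes plain `MAJOR.MINOR.PATCH` strings without pre-release suffixes; the function names and the version numbers in the usage line are made up.

```python
def parse_version(version: str) -> tuple:
    """Split 'MAJOR.MINOR.PATCH' into a tuple of three integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def bump_kind(old: str, new: str) -> str:
    """Classify the change from version `old` to version `new`."""
    old_v, new_v = parse_version(old), parse_version(new)
    if new_v <= old_v:
        return "not an upgrade"
    if new_v[0] != old_v[0]:
        return "major"  # may contain breaking changes
    if new_v[1] != old_v[1]:
        return "minor"  # new features, backwards compatible
    return "patch"      # bug fixes only


if __name__ == "__main__":
    print(bump_kind("4.2.0", "5.0.0"))
```

A check like this can help decide how much testing a provider upgrade deserves before it reaches production, with major bumps warranting a look at the provider's release notes first.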
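The following sketch assembles the two pieces such an installation needs: the requirement string with extras and the URL of the matching constraints file. The URL pattern is the one described in the Airflow installation documentation. The helper names are hypothetical, and the Airflow version `2.7.3` and Python version `3.8` are example values only.

```python
CONSTRAINTS_TEMPLATE = (
    "https://raw.githubusercontent.com/apache/airflow/"
    "constraints-{airflow_version}/constraints-{python_version}.txt"
)


def airflow_with_extras(airflow_version: str, extras: list) -> str:
    """Requirement string that installs Airflow plus the given provider extras."""
    return f"apache-airflow[{','.join(extras)}]=={airflow_version}"


def constraints_url(airflow_version: str, python_version: str) -> str:
    """URL of the constraints file for an Airflow and Python version pair."""
    return CONSTRAINTS_TEMPLATE.format(
        airflow_version=airflow_version, python_version=python_version
    )


if __name__ == "__main__":
    # Example versions only; substitute the ones you actually use.
    requirement = airflow_with_extras("2.7.3", ["google", "amazon"])
    url = constraints_url("2.7.3", "3.8")
    print(f'pip install "{requirement}" --constraint "{url}"')
```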

Note

For Airflow 1.10, we also provided apache-airflow-backport-providers packages that could be installed with those versions. Those were the same providers as for 2.0, but automatically back-ported to work with Airflow 1.10. The last release of backport providers was on March 17, 2021. Backport providers will no longer be released, since Airflow 1.10 reached End-Of-Life on June 17, 2021.

If you want to contribute to Apache Airflow, you can see how to build and extend community-managed providers in https://github.com/apache/airflow/blob/main/airflow/providers/MANAGING_PROVIDERS_LIFECYCLE.rst.
