Supported classes¶
Below is a list of Operators and Hooks that support OpenLineage extraction, along with specific DB types that are compatible with the SQLExecuteQueryOperator.
Important
While we strive to keep the list of supported classes current, please be aware that our updating process is automated and may not always capture everything accurately. Detecting hook-level lineage is challenging, so make sure to double-check the information provided below.
Tip
You can easily implement OpenLineage support for any operator. See Implementing OpenLineage in Operators.
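As a rough illustration of what such an implementation involves, the sketch below shows an operator-like class exposing a `get_openlineage_facets_on_complete` method that reports its input and output datasets. To keep it runnable without Airflow installed, `Dataset` and `OperatorLineage` are minimal stand-ins defined here; in a real DAG they would be expected to come from the OpenLineage client and the `airflow.providers.openlineage.extractors` module. `CopyTableOperator`, its parameters, and the example namespace are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    """Stand-in for the OpenLineage client's dataset class (assumption)."""
    namespace: str
    name: str


@dataclass
class OperatorLineage:
    """Stand-in for the provider's lineage container (assumption)."""
    inputs: list = field(default_factory=list)
    outputs: list = field(default_factory=list)
    run_facets: dict = field(default_factory=dict)
    job_facets: dict = field(default_factory=dict)


class CopyTableOperator:
    """Hypothetical operator that copies one table into another."""

    def __init__(self, namespace, source_table, target_table):
        self.namespace = namespace
        self.source_table = source_table
        self.target_table = target_table

    def execute(self, context=None):
        # The actual copy logic would go here.
        return None

    def get_openlineage_facets_on_complete(self, task_instance=None):
        # Called after a successful run: report what was read and written.
        return OperatorLineage(
            inputs=[Dataset(self.namespace, self.source_table)],
            outputs=[Dataset(self.namespace, self.target_table)],
        )


op = CopyTableOperator(
    "postgres://db.example:5432", "public.orders", "public.orders_copy"
)
lineage = op.get_openlineage_facets_on_complete()
```

The point of the pattern is that the operator itself knows which datasets it touches, so it can report them directly instead of relying on a separate extractor.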
Core operators¶
At the moment, two core operators support OpenLineage. These operators function as a ‘black box,’ capable of running any code, which might limit the extent of lineage extraction. To enhance the extraction of lineage information, operators can utilize the hooks listed below that support OpenLineage.
PythonOperator (via airflow.providers.openlineage.extractors.python.PythonExtractor)
BashOperator (via airflow.providers.openlineage.extractors.bash.BashExtractor)
Spark operators¶
The OpenLineage integration can automatically inject information into Spark application properties when the application is submitted from Airflow. The following is a list of supported operators along with the corresponding information that can be injected.
apache-airflow-providers-google¶
DataprocSubmitJobOperator
Parent Job Information
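The parent job information mentioned above reaches Spark as application properties. The sketch below illustrates the idea with a plain dictionary. `inject_parent_job_info` is a hypothetical helper, not the provider's own function, and the rule of leaving properties untouched when parent information is already present is an assumption made for this example. The three property names are those read by the OpenLineage Spark integration; the namespace, job name, and run ID values are made up.

```python
def inject_parent_job_info(properties, namespace, job_name, run_id):
    """Return a copy of Spark properties with OpenLineage parent job info added.

    If any spark.openlineage.parent* key is already present, the properties
    are returned unchanged so that user-supplied values win (assumption).
    """
    if any(key.startswith("spark.openlineage.parent") for key in properties):
        return dict(properties)
    return {
        **properties,
        "spark.openlineage.parentJobNamespace": namespace,
        "spark.openlineage.parentJobName": job_name,
        "spark.openlineage.parentRunId": run_id,
    }


# Hypothetical values standing in for the Airflow task's identifiers.
props = inject_parent_job_info(
    {"spark.executor.memory": "2g"},
    namespace="default",
    job_name="my_dag.my_task",
    run_id="01941f29-7c00-7087-8906-40e512c257bd",
)

# Properties that already carry parent info are left as they are.
existing = {"spark.openlineage.parentJobName": "custom"}
untouched = inject_parent_job_info(existing, "default", "my_dag.my_task", "unused")
```

With these properties set, the Spark job's own OpenLineage events can point back to the Airflow task that submitted it.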
SQLExecuteQueryOperator¶
SQLExecuteQueryOperator uses SQL parsing for lineage extraction. To extract database-specific information for each database type, a dedicated Hook implementing OpenLineage methods is required. Currently, the following databases are supported:
MySql (via MySqlHook)
PgVector (via PgVectorHook)
Postgres (via PostgresHook)
RedshiftSQL (via RedshiftSQLHook)
Snowflake (via SnowflakeHook)
Trino (via TrinoHook)
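To give a feel for the division of labour described above, the sketch below pulls table names out of a SQL statement and then attaches a namespace of the kind a database-specific hook would supply. It is a deliberately naive stand-in: it uses regular expressions rather than a real SQL parser, so it ignores quoting, CTEs, subqueries, and most statement types. `extract_tables`, `to_datasets`, and the example namespace are hypothetical names chosen for illustration.

```python
import re

# Very rough patterns; a real SQL parser handles far more syntax than this.
_OUTPUT_RE = re.compile(r"\bINSERT\s+INTO\s+([\w.]+)", re.IGNORECASE)
_INPUT_RE = re.compile(r"\b(?:FROM|JOIN)\s+([\w.]+)", re.IGNORECASE)


def extract_tables(sql):
    """Return (inputs, outputs) as sorted lists of table names found in sql."""
    outputs = sorted(set(_OUTPUT_RE.findall(sql)))
    inputs = sorted(set(_INPUT_RE.findall(sql)) - set(outputs))
    return inputs, outputs


def to_datasets(tables, namespace):
    """Attach the namespace that a database-specific hook would provide."""
    return [{"namespace": namespace, "name": table} for table in tables]


sql = """
INSERT INTO analytics.daily_orders
SELECT o.id, c.name
FROM public.orders o
JOIN public.customers c ON o.customer_id = c.id
"""

inputs, outputs = extract_tables(sql)
# Hypothetical namespace; the real value depends on the hook and connection.
input_datasets = to_datasets(inputs, "postgres://db.example:5432")
output_datasets = to_datasets(outputs, "postgres://db.example:5432")
```

The parsing step is the same for every database, while details such as the namespace depend on the connection, which is why each supported database needs its own hook.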
Providers¶
The operators and hooks listed below from each provider are natively equipped with OpenLineage support.