Microsoft Azure Data Explorer¶
The Azure Data Explorer connection type enables Azure Data Explorer (ADX) integrations in Airflow.
Authenticating to Azure Data Explorer¶
There are five ways to connect to Azure Data Explorer using Airflow.
Use an AAD application key or certificate (i.e. use “AAD_APP” or “AAD_APP_CERT” as the Authentication Method in the Airflow connection).
Use AAD username and password (i.e. use “AAD_CREDS” as the Authentication Method in the Airflow connection).
Use an AAD device code (i.e. use “AAD_DEVICE” as the Authentication Method in the Airflow connection).
Use managed identity by setting managed_identity_client_id and workload_identity_tenant_id (under the hood, the hook passes these arguments to DefaultAzureCredential).
Fall back on DefaultAzureCredential. This tries several authentication options in turn: managed system identity, environment variables, authentication through the Azure CLI, and so on.
Only one authorization method can be used at a time. If you need to manage multiple credentials or keys then you should configure multiple connections.
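As a minimal sketch of the multiple-connections approach, the snippet below defines two connections, each with its own authentication method, and derives the environment variable that Airflow reads for each one. The connection IDs, credentials, and host are hypothetical placeholders, and the URIs follow the form shown under Configuring the Connection.

```python
import os

# Hypothetical connection IDs and placeholder URIs; replace them with real values.
CONNECTIONS = {
    "adx_app": "azure-data-explorer://app-id:app-key@mycluster.com?auth_method=AAD_APP&tenant=tenant-id",
    "adx_default_cred": "azure-data-explorer://mycluster.com?auth_method=AZURE_TOKEN_CRED",
}


def env_var_name(conn_id):
    """Return the environment variable Airflow reads for a connection ID."""
    return f"AIRFLOW_CONN_{conn_id.upper()}"


def export_connections(connections, environ=os.environ):
    """Set one AIRFLOW_CONN_* variable per connection and return the names used."""
    names = []
    for conn_id, uri in connections.items():
        name = env_var_name(conn_id)
        environ[name] = uri
        names.append(name)
    return names


# A plain dict is passed here so the example does not modify the real environment.
exported = export_connections(CONNECTIONS, environ={})
print(exported)
```

Each task can then select a credential by connection ID, so no single connection has to carry more than one authentication method.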
Default Connection IDs¶
All hooks and operators related to Microsoft Azure Data Explorer use azure_data_explorer_default by default.
Configuring the Connection¶
- Data Explorer Cluster URL
Specify the Data Explorer cluster URL. Needed for all authentication methods.
- Authentication Method
Specify authentication method. Available authentication methods are:
AAD_APP: Authentication with an AAD application ID and key. A Tenant ID is required when using this method. Provide the application ID and application key through the Username and Password parameters.
AAD_APP_CERT: Authentication with AAD application certificate. Tenant ID, Application PEM Certificate, and Application Certificate Thumbprint are required when using this method.
AAD_CREDS: Authentication with AAD username and password. A Tenant ID is required when using this method. Username and Password parameters are used for authentication with AAD.
AAD_DEVICE: Authenticate with AAD device code. Please note that if you choose this option, you’ll need to authenticate for every new instance that is initialized. It is highly recommended to create one instance and use it for all queries.
AZURE_TOKEN_CRED: Authentication with DefaultAzureCredential. This tries several authentication options in turn: managed system identity, environment variables, authentication through the Azure CLI, and so on. Only the “Data Explorer Cluster URL” is required when using this method.
- Username (optional)
Specify the username used for Azure Data Explorer. Needed for the AAD_APP, AAD_APP_CERT, and AAD_CREDS authentication methods.
- Password (optional)
Specify the password used for Azure Data Explorer. Needed for the AAD_APP and AAD_CREDS authentication methods.
- Tenant ID (optional)
Specify the AAD tenant. Needed for the AAD_APP, AAD_APP_CERT, and AAD_CREDS authentication methods.
- Application PEM Certificate (optional)
Specify the certificate. Needed for AAD_APP_CERT authentication method.
- Application Certificate Thumbprint (optional)
Specify the thumbprint needed for use with AAD_APP_CERT authentication method.
- Managed Identity Client ID (optional)
The client ID of a user-assigned managed identity. If provided together with workload_identity_tenant_id, both are passed to DefaultAzureCredential.
- Workload Identity Tenant ID (optional)
ID of the application’s Microsoft Entra tenant, also called its “directory” ID. If provided together with managed_identity_client_id, both are passed to DefaultAzureCredential.
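The field requirements listed above can be summarized as a lookup table. The sketch below is an illustrative helper, not part of the Airflow provider: the field keys (cluster_url, username, and so on) are hypothetical labels for the form fields, and the table is transcribed from the descriptions in this section.

```python
# Required connection fields per authentication method, transcribed from the
# field descriptions above. The key names are hypothetical labels, not the
# provider's internal names.
REQUIRED_FIELDS = {
    "AAD_APP": {"cluster_url", "username", "password", "tenant"},
    "AAD_APP_CERT": {"cluster_url", "username", "tenant", "certificate", "thumbprint"},
    "AAD_CREDS": {"cluster_url", "username", "password", "tenant"},
    "AAD_DEVICE": {"cluster_url"},
    "AZURE_TOKEN_CRED": {"cluster_url"},
}


def missing_fields(auth_method, provided):
    """Return the sorted names of required fields that are absent or empty."""
    try:
        required = REQUIRED_FIELDS[auth_method]
    except KeyError:
        raise ValueError(f"Unknown authentication method: {auth_method!r}") from None
    return sorted(name for name in required if not provided.get(name))


# Example: an AAD_APP connection that still lacks a password and a tenant.
print(missing_fields("AAD_APP", {"cluster_url": "https://mycluster.com", "username": "app-id"}))
```

A check like this could run before a connection is saved, so that a missing tenant or thumbprint is reported at configuration time rather than on the first query.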
When specifying the connection in an environment variable, use URI syntax.
Note that all components of the URI should be URL-encoded.
For example:
export AIRFLOW_CONN_AZURE_DATA_EXPLORER_DEFAULT='azure-data-explorer://add%20username:add%20password@mycluster.com?auth_method=AAD_APP&tenant=tenant+id'
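Encoding the URI by hand is error-prone, so it may help to build it programmatically. The following is a minimal sketch using only the Python standard library; the helper name build_adx_conn_uri is hypothetical, and the values are the placeholders from the example above.

```python
from urllib.parse import quote, urlencode


def build_adx_conn_uri(login, password, host, extras):
    """Return a URL-encoded Airflow connection URI for Azure Data Explorer."""
    # Percent-encode every reserved character in the credentials,
    # including "/" and "@", so they cannot be mistaken for URI delimiters.
    userinfo = f"{quote(login, safe='')}:{quote(password, safe='')}"
    # urlencode form-encodes the query string, so a space becomes "+".
    query = urlencode(extras)
    return f"azure-data-explorer://{userinfo}@{host}?{query}"


uri = build_adx_conn_uri(
    login="add username",
    password="add password",
    host="mycluster.com",
    extras={"auth_method": "AAD_APP", "tenant": "tenant id"},
)
print(uri)
```

With these inputs the helper reproduces the URI in the export line above, with spaces in the credentials encoded as %20 and the space in the tenant value encoded as +.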