Azure Synapse Operators

Azure Synapse Analytics is a limitless analytics service that brings together data integration, enterprise data warehousing and big data analytics. It gives you the freedom to query data on your terms, using either serverless or dedicated options—at scale. Azure Synapse brings these worlds together with a unified experience to ingest, explore, prepare, transform, manage and serve data for immediate BI and machine learning needs.

AzureSynapseRunSparkBatchOperator

Use the AzureSynapseRunSparkBatchOperator to execute a Spark application within Synapse Analytics. By default, the operator periodically checks the status of the submitted Spark job and waits for it to terminate with a "Succeeded" status.
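The wait-until-finished behaviour described above can be pictured as a simple polling loop. The following is a minimal sketch only, not the provider's implementation: the function name wait_for_job, the get_status callable, the check_interval and timeout parameters, and the failure state names are all hypothetical and chosen for illustration.

```python
import time
from typing import Callable

# Status strings are illustrative assumptions; only "Succeeded" comes from the text above.
SUCCEEDED = "Succeeded"
FAILED_STATES = frozenset({"Failed", "Cancelled"})


def wait_for_job(
    get_status: Callable[[], str],
    check_interval: float = 60.0,
    timeout: float = 3600.0,
) -> bool:
    """Poll get_status() until the job succeeds, fails, or the timeout expires.

    Returns True on success and False on a failure state.
    Raises TimeoutError if no terminal state is seen before the deadline.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status == SUCCEEDED:
            return True
        if status in FAILED_STATES:
            return False
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Job still in state {status!r} after {timeout} seconds")
        # Wait before asking for the status again.
        time.sleep(check_interval)
```

In this sketch, get_status stands in for whatever call retrieves the current job state, so the loop can be exercised with a fake status source without contacting Azure.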

Below is an example of using this operator to execute a Spark application on Azure Synapse.

tests/system/microsoft/azure/example_azure_synapse.py

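from airflow.providers.microsoft.azure.operators.synapse import AzureSynapseRunSparkBatchOperator

# SPARK_JOB_PAYLOAD is the Spark batch job definition, defined elsewhere in the example file.
run_spark_job = AzureSynapseRunSparkBatchOperator(
    task_id="run_spark_job",
    spark_pool="provsparkpool",
    payload=SPARK_JOB_PAYLOAD,  # type: ignore
)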
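The example above passes SPARK_JOB_PAYLOAD without showing its contents. As a rough sketch, such a payload could be a plain dictionary describing the batch job. The field names below are assumed to follow the Livy-style options accepted for Synapse Spark batch jobs and should be checked against the Azure reference; the job name, storage path, arguments and sizing values are hypothetical placeholders.

```python
# Hypothetical payload for illustration; replace every value with your own.
SPARK_JOB_PAYLOAD = {
    "name": "SparkJob",
    # Placeholder path to the main application file in ADLS Gen2 storage.
    "file": "abfss://<container>@<storage-account>.dfs.core.windows.net/wordcount.py",
    # Command-line arguments handed to the application.
    "args": ["<input-path>", "<output-path>"],
    # Extra Spark configuration properties.
    "conf": {"spark.dynamicAllocation.enabled": "false"},
    # Executor and driver sizing.
    "numExecutors": 2,
    "executorCores": 4,
    "executorMemory": "28g",
    "driverCores": 4,
    "driverMemory": "28g",
}
```

Keeping the payload as a module-level constant, as the example file does, lets the same job definition be reused across tasks.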

AzureSynapseRunPipelineOperator

Use the AzureSynapseRunPipelineOperator (airflow.providers.microsoft.azure.operators.synapse.AzureSynapseRunPipelineOperator) to execute a pipeline within Synapse Analytics.

tests/system/microsoft/azure/example_synapse_run_pipeline.py

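from airflow.providers.microsoft.azure.operators.synapse import AzureSynapseRunPipelineOperator

run_pipeline1 = AzureSynapseRunPipelineOperator(
    task_id="run_pipeline1",
    azure_synapse_conn_id="azure_synapse_connection",
    pipeline_name="Pipeline 1",
    # Placeholder value; supply the development endpoint of your Synapse workspace.
    azure_synapse_workspace_dev_endpoint="azure_synapse_workspace_dev_endpoint",
)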

Reference

For further information, please refer to the Microsoft documentation:

Azure Synapse Analytics Documentation