airflow.providers.dbt.cloud.operators.dbt

Module Contents

Classes

DbtCloudRunJobOperatorLink

Allows users to monitor the triggered job run directly in dbt Cloud.

DbtCloudRunJobOperator

Executes a dbt Cloud job.

DbtCloudGetJobRunArtifactOperator

Download artifacts from a dbt Cloud job run.

DbtCloudListJobsOperator

List jobs in a dbt Cloud project.

class airflow.providers.dbt.cloud.operators.dbt.DbtCloudRunJobOperatorLink[source]

Bases: airflow.models.BaseOperatorLink

Allows users to monitor the triggered job run directly in dbt Cloud.

name = 'Monitor Job Run'[source]

get_link(operator, *, ti_key)[source]

Link to external system.

Note: The old signature of this function was (self, operator, dttm: datetime). That is still supported at runtime but is deprecated.

Parameters
  • operator (airflow.models.BaseOperator) – The Airflow operator object this link is associated with.

  • ti_key – TaskInstance ID to return link for.

Returns

link to external system
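
A rough, hedged sketch (not necessarily the provider's exact implementation) of how a BaseOperatorLink subclass can resolve its URL with the new get_link signature; the XCom key "job_run_url" is an assumption for illustration.

    # Hypothetical operator link; key name "job_run_url" is assumed for illustration.
    from airflow.models import BaseOperatorLink, XCom


    class MonitorJobRunLink(BaseOperatorLink):
        """Hypothetical link that resolves a URL stored in XCom by the operator."""

        name = "Monitor Job Run"

        def get_link(self, operator, *, ti_key):
            # Look up the URL pushed to XCom for this task instance; fall back to "".
            return XCom.get_value(ti_key=ti_key, key="job_run_url") or ""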

class airflow.providers.dbt.cloud.operators.dbt.DbtCloudRunJobOperator(*, dbt_cloud_conn_id=DbtCloudHook.default_conn_name, job_id, account_id=None, trigger_reason=None, steps_override=None, schema_override=None, wait_for_termination=True, timeout=60 * 60 * 24 * 7, check_interval=60, additional_run_config=None, reuse_existing_run=False, retry_from_failure=False, deferrable=conf.getboolean('operators', 'default_deferrable', fallback=False), **kwargs)[source]

Bases: airflow.models.BaseOperator

Executes a dbt Cloud job.

See also

For more information on how to use this operator, take a look at the guide: Trigger a dbt Cloud Job

Parameters
  • dbt_cloud_conn_id (str) – The connection ID for connecting to dbt Cloud.

  • job_id (int) – The ID of a dbt Cloud job.

  • account_id (int | None) – Optional. The ID of a dbt Cloud account.

  • trigger_reason (str | None) – Optional. Description of the reason to trigger the job. Defaults to “Triggered via Apache Airflow by task <task_id> in the <dag_id> DAG.”

  • steps_override (list[str] | None) – Optional. List of dbt commands to execute when triggering the job instead of those configured in dbt Cloud.

  • schema_override (str | None) – Optional. Override the destination schema in the configured target for this job.

  • wait_for_termination (bool) – Flag to wait on a job run’s termination. Enabled by default; it can be disabled to wait asynchronously for a long-running job run using the DbtCloudJobRunSensor.

  • timeout (int) – Time in seconds to wait for a job run to reach a terminal status for non-asynchronous waits. Used only if wait_for_termination is True. Defaults to 7 days.

  • check_interval (int) – Time in seconds to check on a job run’s status for non-asynchronous waits. Used only if wait_for_termination is True. Defaults to 60 seconds.

  • additional_run_config (dict[str, Any] | None) – Optional. Any additional parameters that should be included in the API request when triggering the job.

  • reuse_existing_run (bool) – Flag to determine whether to reuse an existing non-terminal job run. If set to True and a non-terminal job run is found, the latest such run is used instead of triggering a new job run.

  • retry_from_failure (bool) – Flag to determine whether to retry the job run from failure. If set to True and the last job run has failed, a new job run is triggered with the same configuration as the failed run. For more information on retry logic, see: https://docs.getdbt.com/dbt-cloud/api-v2#/operations/Retry%20Failed%20Job

  • deferrable (bool) – Run the operator in deferrable mode.

Returns

The ID of the triggered dbt Cloud job run.
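
A minimal usage sketch (assuming the Airflow 2.4+ DAG API; the connection ID, job ID, and timing values are illustrative, not values from this reference). The operator can either block until the run terminates or defer the wait to the triggerer; the run ID it returns is pushed to XCom for downstream tasks.

    from datetime import datetime

    from airflow import DAG
    from airflow.providers.dbt.cloud.operators.dbt import DbtCloudRunJobOperator

    with DAG(
        dag_id="example_dbt_cloud",
        start_date=datetime(2024, 1, 1),
        schedule=None,
        catchup=False,
    ):
        # Trigger the job and poll until it reaches a terminal state (default behaviour).
        trigger_job = DbtCloudRunJobOperator(
            task_id="trigger_dbt_cloud_job",
            dbt_cloud_conn_id="dbt_cloud_default",  # hypothetical connection ID
            job_id=48617,                           # hypothetical job ID
            check_interval=30,
            timeout=3600,
        )

        # Trigger the same job but defer the wait to the triggerer, freeing the worker slot.
        trigger_job_deferred = DbtCloudRunJobOperator(
            task_id="trigger_dbt_cloud_job_deferrable",
            job_id=48617,
            deferrable=True,
        )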

template_fields = ('dbt_cloud_conn_id', 'job_id', 'account_id', 'trigger_reason', 'steps_override',...[source]
execute(context)[source]

This is the main method to derive when creating an operator.

Context is the same dictionary used when rendering Jinja templates.

Refer to get_template_context for more context.

execute_complete(context, event)[source]

Execute when the trigger fires - returns immediately.

on_kill()[source]

Override this method to clean up subprocesses when a task instance gets killed.

Any use of the threading, subprocess or multiprocessing module within an operator needs to be cleaned up, or it will leave ghost processes behind.

hook()[source]

Return the dbt Cloud hook.

get_openlineage_facets_on_complete(task_instance)[source]

Implement _on_complete because the job run needs to be triggered first in the execute method.

This should send additional events only if the operator’s wait_for_termination is set to True.

class airflow.providers.dbt.cloud.operators.dbt.DbtCloudGetJobRunArtifactOperator(*, dbt_cloud_conn_id=DbtCloudHook.default_conn_name, run_id, path, account_id=None, step=None, output_file_name=None, **kwargs)[source]

Bases: airflow.models.BaseOperator

Download artifacts from a dbt Cloud job run.

See also

For more information on how to use this operator, take a look at the guide: Download run artifacts

Parameters
  • dbt_cloud_conn_id (str) – The connection ID for connecting to dbt Cloud.

  • run_id (int) – The ID of a dbt Cloud job run.

  • path (str) – The file path related to the artifact file. Paths are rooted at the target/ directory. Use “manifest.json”, “catalog.json”, or “run_results.json” to download dbt-generated artifacts for the run.

  • account_id (int | None) – Optional. The ID of a dbt Cloud account.

  • step (int | None) – Optional. The index of the Step in the Run to query for artifacts. The first step in the run has the index 1. If the step parameter is omitted, artifacts for the last step in the run will be returned.

  • output_file_name (str | None) – Optional. The desired file name for the downloaded artifact file. Defaults to <run_id>_<path> (e.g. “728368_run_results.json”).
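
A short sketch of pulling artifacts for a run triggered upstream; the upstream task ID "trigger_dbt_cloud_job" and the output file name are illustrative assumptions. Because run_id is a templated field, the Jinja expression below is rendered at runtime.

    from airflow.providers.dbt.cloud.operators.dbt import DbtCloudGetJobRunArtifactOperator

    get_run_results = DbtCloudGetJobRunArtifactOperator(
        task_id="get_run_results_artifact",
        # Pull the run ID returned by the (assumed) upstream trigger task via XCom.
        run_id="{{ ti.xcom_pull(task_ids='trigger_dbt_cloud_job') }}",
        path="run_results.json",
        output_file_name="latest_run_results.json",  # hypothetical local file name
    )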

template_fields = ('dbt_cloud_conn_id', 'run_id', 'path', 'account_id', 'output_file_name')[source]
execute(context)[source]

This is the main method to derive when creating an operator.

Context is the same dictionary used when rendering Jinja templates.

Refer to get_template_context for more context.

class airflow.providers.dbt.cloud.operators.dbt.DbtCloudListJobsOperator(*, dbt_cloud_conn_id=DbtCloudHook.default_conn_name, account_id=None, project_id=None, order_by=None, **kwargs)[source]

Bases: airflow.models.BaseOperator

List jobs in a dbt Cloud project.

See also

For more information on how to use this operator, take a look at the guide: List jobs

Retrieves metadata for all jobs tied to a specified dbt Cloud account. If a project_id is supplied, only jobs belonging to that project will be retrieved.

Parameters
  • dbt_cloud_conn_id (str) – The connection ID for connecting to dbt Cloud.

  • account_id (int | None) – Optional. If an account ID is not provided explicitly, the account ID from the dbt Cloud connection will be used.

  • order_by (str | None) – Optional. Field to order the result by. Use ‘-’ to indicate reverse order. For example, to use reverse order by the run ID, use order_by=-id.

  • project_id (int | None) – Optional. The ID of a dbt Cloud project.
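
A minimal sketch; the project ID is an illustrative assumption. The matching jobs are returned by execute() and are therefore available to downstream tasks via XCom.

    from airflow.providers.dbt.cloud.operators.dbt import DbtCloudListJobsOperator

    list_dbt_cloud_jobs = DbtCloudListJobsOperator(
        task_id="list_dbt_cloud_jobs",
        project_id=159771,   # hypothetical project ID
        order_by="-id",      # "-" prefix reverses the ordering, i.e. newest first
    )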

template_fields = ('account_id', 'project_id')[source]
execute(context)[source]

This is the main method to derive when creating an operator.

Context is the same dictionary used when rendering Jinja templates.

Refer to get_template_context for more context.
