airflow.providers.standard.operators.bash

Module Contents

Classes

BashOperator

Execute a Bash script, command or set of commands.

class airflow.providers.standard.operators.bash.BashOperator(*, bash_command, env=None, append_env=False, output_encoding='utf-8', skip_exit_code=None, skip_on_exit_code=99, cwd=None, output_processor=lambda result: ..., **kwargs)[source]

Bases: airflow.models.baseoperator.BaseOperator

Execute a Bash script, command or set of commands.

See also

For more information on how to use this operator, take a look at the guide: BashOperator

If BaseOperator.do_xcom_push is True, the last line written to stdout will also be pushed to an XCom when the bash command completes

Parameters
  • bash_command (str | airflow.utils.types.ArgNotSet) – The command, set of commands or reference to a Bash script (must be ‘.sh’ or ‘.bash’) to be executed. (templated)

  • env (dict[str, str] | None) – If env is not None, it must be a dict that defines the environment variables for the new process; these are used instead of inheriting the current process environment, which is the default behavior. (templated)

  • append_env (bool) – If False(default) uses the environment variables passed in env params and does not inherit the current process environment. If True, inherits the environment variables from current passes and then environment variable passed by the user will either update the existing inherited environment variables or the new variables gets appended to it

  • output_encoding (str) – Output encoding of Bash command

  • skip_on_exit_code (int | collections.abc.Container[int] | None) – If task exits with this exit code, leave the task in skipped state (default: 99). If set to None, any non-zero exit code will be treated as a failure.

  • cwd (str | None) – Working directory to execute the command in (templated). If None (default), the command is run in a temporary directory. To use current DAG folder as the working directory, you might set template {{ dag_run.dag.folder }}. When bash_command is a ‘.sh’ or ‘.bash’ file, Airflow must have write access to the working directory. The script will be rendered (Jinja template) into a new temporary file in this directory.

  • output_processor (Callable[[str], Any]) – Function to further process the output of the bash script (default is lambda output: output).

Airflow will evaluate the exit code of the Bash command. In general, a non-zero exit code will result in task failure and zero will result in task success. Exit code 99 (or another set in skip_on_exit_code) will throw an airflow.exceptions.AirflowSkipException, which will leave the task in skipped state. You can have all non-zero exit codes be treated as a failure by setting skip_on_exit_code=None.

Exit code

Behavior

0

success

skip_on_exit_code (default: 99)

raise airflow.exceptions.AirflowSkipException

otherwise

raise airflow.exceptions.AirflowException

Note

Airflow will not recognize a non-zero exit code unless the whole shell exit with a non-zero exit code. This can be an issue if the non-zero exit arises from a sub-command. The easiest way of addressing this is to prefix the command with set -e;

bash_command = "set -e; python3 script.py '{{ data_interval_end }}'"

Note

To simply execute a .sh or .bash script (without any Jinja template), add a space after the script name bash_command argument – for example bash_command="my_script.sh ". This is because Airflow tries to load this file and process it as a Jinja template when it ends with .sh or .bash.

If you have Jinja template in your script, do not put any blank space. And add the script’s directory in the DAG’s template_searchpath. If you specify a cwd, Airflow must have write access to this directory. The script will be rendered (Jinja template) into a new temporary file in this directory.

Warning

Care should be taken with “user” input or when using Jinja templates in the bash_command, as this bash operator does not perform any escaping or sanitization of the command.

This applies mostly to using “dag_run” conf, as that can be submitted via users in the Web UI. Most of the default template variables are not at risk.

For example, do not do this:

bash_task = BashOperator(
    task_id="bash_task",
    bash_command='echo "Here is the message: \'{{ dag_run.conf["message"] if dag_run else "" }}\'"',
)

Instead, you should pass this via the env kwarg and use double-quotes inside the bash_command, as below:

bash_task = BashOperator(
    task_id="bash_task",
    bash_command="echo \"here is the message: '$message'\"",
    env={"message": '{{ dag_run.conf["message"] if dag_run else "" }}'},
)

New in version 2.10.0: The output_processor parameter.

template_fields: collections.abc.Sequence[str] = ('bash_command', 'env', 'cwd')[source]
template_fields_renderers[source]
template_ext: collections.abc.Sequence[str] = ('.sh', '.bash')[source]
ui_color = '#f0ede4'[source]
subprocess_hook()[source]

Returns hook for running the bash command.

get_env(context)[source]

Build the set of environment variables to be exposed for the bash command.

execute(context)[source]

Derive when creating an operator.

Context is the same dictionary used as when rendering jinja templates.

Refer to get_template_context for more context.

on_kill()[source]

Override this method to clean up subprocesses when a task instance gets killed.

Any use of the threading, subprocess or multiprocessing module within an operator needs to be cleaned up, or it will leave ghost processes behind.

Was this entry helpful?