Running Airflow in Docker¶
This quick-start guide shows how to start Airflow with CeleryExecutor in Docker. This is the fastest way to get Airflow up and running.
Before you begin¶
Follow these steps to install the necessary tools.
Install Docker Community Edition (CE) on your workstation.
Install Docker Compose v1.27.0 or newer on your workstation.
Older versions of docker-compose do not support all the features required by the docker-compose.yaml file, so double check that your version meets the minimum version requirement.
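You can verify the installed versions before proceeding. Depending on how Compose was installed, the version command differs slightly (docker-compose --version for the standalone v1 binary, docker compose version for the v2 plugin):
docker --version
docker-compose --version   # or: docker compose version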
docker-compose.yaml¶
To deploy Airflow on Docker Compose, you should fetch docker-compose.yaml.
curl -LfO 'https://airflow.apache.org/docs/apache-airflow/2.0.1/docker-compose.yaml'
This file contains several service definitions:
airflow-scheduler - The scheduler monitors all tasks and DAGs, then triggers the task instances once their dependencies are complete.
airflow-webserver - The webserver is available at http://localhost:8080.
airflow-worker - The worker that executes the tasks given by the scheduler.
airflow-init - The initialization service.
flower - The flower app for monitoring the environment. It is available at http://localhost:5555.
postgres - The database.
redis - The redis broker that forwards messages from the scheduler to the worker.
All these services allow you to run Airflow with CeleryExecutor. For more information, see Basic Airflow architecture.
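To see exactly which services the file defines, you can ask Compose itself (run this in the directory containing docker-compose.yaml):
docker compose config --services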
Some directories in the container are mounted, which means that their contents are synchronized between your computer and the container.
./dags - you can put your DAG files here.
./logs - contains logs from task execution and scheduler.
./plugins - you can put your custom plugins here.
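Once the services are running (see Running Airflow below), you can check that the mounts work by creating a file on the host and listing the directory inside a container. In this sketch, example_dag.py is a hypothetical file name and /opt/airflow/dags is the mount target used by the official compose file:
touch ./dags/example_dag.py
docker compose exec airflow-scheduler ls /opt/airflow/dags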
Initializing Environment¶
Before starting Airflow for the first time, you need to prepare your environment, i.e. create the necessary files and directories, and initialize the database.
On Linux, the mounted volumes in container use the native Linux filesystem user/group permissions, so you have to make sure the container and host computer have matching file permissions.
mkdir ./dags ./logs ./plugins
echo -e "AIRFLOW_UID=$(id -u)\nAIRFLOW_GID=0" > .env
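The resulting .env file should look similar to the snippet below; the UID value here is just an example, yours will be whatever id -u prints:
AIRFLOW_UID=1000
AIRFLOW_GID=0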
On all operating systems, you need to run database migrations and create the first user account. To do this, run:
docker compose up airflow-init
After initialization is complete, you should see a message like the one below:
airflow-init_1 | Upgrades done
airflow-init_1 | Admin user airflow created
airflow-init_1 | 2.0.1
start_airflow-init_1 exited with code 0
The account created has the login airflow and the password airflow.
Running Airflow¶
Now you can start all services:
docker compose up
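If you prefer to keep this terminal free, you can start the services in the background instead by adding the -d (detached) flag:
docker compose up -d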
In a second terminal you can check the condition of the containers and make sure that no containers are in an unhealthy condition:
$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
247ebe6cf87a apache/airflow:2.0.1 "/usr/bin/dumb-init …" 3 minutes ago Up 3 minutes 8080/tcp compose_airflow-worker_1
ed9b09fc84b1 apache/airflow:2.0.1 "/usr/bin/dumb-init …" 3 minutes ago Up 3 minutes 8080/tcp compose_airflow-scheduler_1
65ac1da2c219 apache/airflow:2.0.1 "/usr/bin/dumb-init …" 3 minutes ago Up 3 minutes (healthy) 0.0.0.0:5555->5555/tcp, 8080/tcp compose_flower_1
7cb1fb603a98 apache/airflow:2.0.1 "/usr/bin/dumb-init …" 3 minutes ago Up 3 minutes (healthy) 0.0.0.0:8080->8080/tcp compose_airflow-webserver_1
74f3bbe506eb postgres:13 "docker-entrypoint.s…" 18 minutes ago Up 17 minutes (healthy) 5432/tcp compose_postgres_1
0bd6576d23cb redis:latest "docker-entrypoint.s…" 10 hours ago Up 17 minutes (healthy) 0.0.0.0:6379->6379/tcp compose_redis_1
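If any container reports an unhealthy status, you can inspect its output with docker compose logs, passing the service name from docker-compose.yaml, for example:
docker compose logs airflow-webserver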
Accessing the environment¶
After starting Airflow, you can interact with it in 3 ways:
by running CLI commands.
via a browser using the web interface.
using the REST API.
Running the CLI commands¶
You can also run CLI commands, but you have to do it in one of the defined airflow-* services. For example, to run airflow info, run the following command:
docker compose run airflow-worker airflow info
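Any other CLI command can be run the same way; for example, to list the DAGs known to this environment:
docker compose run airflow-worker airflow dags list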
If you have Linux or Mac OS, you can make your work easier by downloading an optional wrapper script that allows you to run commands with a simpler syntax.
curl -LfO 'https://airflow.apache.org/docs/apache-airflow/2.0.1/airflow.sh'
chmod +x airflow.sh
Now you can run commands more easily:
./airflow.sh info
You can also use bash as a parameter to enter an interactive bash shell in the container, or python to enter a python shell:
./airflow.sh bash
./airflow.sh python
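Inside the shell opened by ./airflow.sh bash, the Airflow CLI is on the PATH, so you can run commands directly, for example:
airflow version
airflow dags list
exit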
Accessing the web interface¶
Once the cluster has started up, you can log in to the web interface and try to run some tasks.
The webserver is available at: http://localhost:8080.
The default account has the login airflow and the password airflow.
Sending requests to the REST API¶
Basic username/password authentication is currently supported for the REST API, which means you can use common tools to send requests to the API.
The webserver is available at: http://localhost:8080.
The default account has the login airflow and the password airflow.
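As a quick smoke test, you can query the version endpoint, which the stable REST API exposes at /api/v1/version:
curl -X GET --user "airflow:airflow" "http://localhost:8080/api/v1/version"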
Here is a sample curl command, which sends a request to retrieve a pool list:
ENDPOINT_URL="http://localhost:8080"
curl -X GET \
--user "airflow:airflow" \
"${ENDPOINT_URL}/api/v1/pools"
Cleaning up¶
To stop and delete containers, delete volumes with database data, and remove downloaded images, run:
docker compose down --volumes --rmi all
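If you only want to pause the environment without deleting data, you can use the less destructive variants instead:
docker compose stop    # stop the containers; keep containers, volumes and images
docker compose down    # also remove containers and networks; keep volumes and images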
Notes¶
By default, the Docker Compose file uses the latest Airflow image (apache/airflow). If you need to, you can customize and extend it.
What’s Next?¶
From this point, you can head to the Tutorial section for further examples or the How-to Guides section if you’re ready to get your hands dirty.