airflow.models.serialized_dag
Serialized DAG table in database.
Module Contents
-
class airflow.models.serialized_dag.SerializedDagModel(dag: DAG)
Bases: airflow.models.base.Base
A table for serialized DAGs.
The serialized_dag table is a snapshot of DAG files synchronized by the scheduler. This feature is controlled by:
- [core] min_serialized_dag_update_interval = 30 (s): serialized DAGs are updated in the DB when a file is processed by the scheduler. To reduce the DB write rate, this sets a minimum interval between updates of a serialized DAG.
- [scheduler] dag_dir_list_interval = 300 (s): the interval at which serialized DAGs are deleted from the DB after their files are deleted. A smaller interval, such as 60, is suggested.
The webserver uses this table to load DAGs, because reading from the database is lightweight compared to importing from files. This addresses the webserver scalability issue.
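As a rough illustration of where the two settings above live, the following sketch parses an airflow.cfg-style fragment with Python's standard configparser module. The section and option names are the ones listed above; the fragment text and the variable names are made up for illustration, and this is not how Airflow itself loads its configuration.

```python
import configparser

# Illustrative airflow.cfg-style fragment using the default values quoted above.
CFG_TEXT = """
[core]
min_serialized_dag_update_interval = 30

[scheduler]
dag_dir_list_interval = 300
"""

parser = configparser.ConfigParser()
parser.read_string(CFG_TEXT)

# Minimum number of seconds between two updates of the same serialized DAG.
min_update_interval = parser.getint("core", "min_serialized_dag_update_interval")

# Number of seconds between scans of the DAG directory; serialized DAGs whose
# files have been deleted are removed on this schedule.
dag_dir_list_interval = parser.getint("scheduler", "dag_dir_list_interval")

print(min_update_interval, dag_dir_list_interval)
```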
-
classmethod write_dag(cls, dag: DAG, min_update_interval: Optional[int] = None, session: Session = None)
Serializes a DAG and writes it into the database. If the record already exists, it checks whether the serialized DAG has changed. If it has changed, the record is updated; otherwise the write is skipped.
- Parameters
dag -- a DAG to be written into database
min_update_interval -- minimal interval in seconds to update serialized DAG
session -- ORM Session
- Returns
Boolean indicating if the DAG was written to the DB
-
classmethod read_all_dags(cls, session: Session = None)
Reads all DAGs in the serialized_dag table.
- Parameters
session -- ORM Session
- Returns
a dict of DAGs read from database
-
classmethod remove_dag(cls, dag_id: str, session: Session = None)
Deletes a DAG with the given dag_id.
- Parameters
dag_id -- dag_id to be deleted
session -- ORM Session
-
classmethod remove_deleted_dags(cls, alive_dag_filelocs: List[str], session=None)
Deletes DAGs not included in alive_dag_filelocs.
- Parameters
alive_dag_filelocs -- file paths of alive DAGs
session -- ORM Session
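The filtering that remove_deleted_dags describes can be sketched without a database. The dictionary below is a hypothetical stand-in for the table (dag_id mapped to the file location of the DAG), and `remove_deleted_dags_sketch` is an illustrative helper, not part of Airflow.

```python
from typing import Dict, List


def remove_deleted_dags_sketch(table: Dict[str, str],
                               alive_dag_filelocs: List[str]) -> List[str]:
    """Delete rows whose file location is not alive; return the removed dag_ids."""
    alive = set(alive_dag_filelocs)
    removed = sorted(dag_id for dag_id, fileloc in table.items()
                     if fileloc not in alive)
    for dag_id in removed:
        del table[dag_id]
    return removed


# Two DAGs defined in one file that still exists, one DAG whose file is gone.
table = {
    "etl_daily": "/dags/etl.py",
    "etl_hourly": "/dags/etl.py",
    "old_report": "/dags/report.py",
}
removed = remove_deleted_dags_sketch(table, ["/dags/etl.py"])
```

Because the match is on file location rather than dag_id, every DAG defined in a deleted file is removed together.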
-
classmethod has_dag(cls, dag_id: str, session: Session = None)
Checks whether a DAG exists in the serialized_dag table.
- Parameters
dag_id -- the DAG to check
session -- ORM Session
-
classmethod get(cls, dag_id: str, session: Session = None)
Gets the SerializedDAG for the given DAG ID. If it is passed the ID of a subdag, it looks up the root dag_id from the DAG table.
- Parameters
dag_id -- the DAG to fetch
session -- ORM Session
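The subdag handling described for get can be pictured as a two-step lookup. This is only a sketch under stated assumptions: the two dictionaries are hypothetical stand-ins for the serialized_dag and DAG tables, `get_sketch` is not an Airflow function, and trying the direct lookup before the root lookup is an assumed order.

```python
from typing import Dict, Optional

# Hypothetical stand-in for the serialized_dag table: dag_id -> serialized data.
serialized_rows: Dict[str, dict] = {
    "parent": {"dag_id": "parent", "tasks": ["section_1"]},
}

# Hypothetical stand-in for the DAG table: dag_id -> root dag_id (None for a root DAG).
root_dag_ids: Dict[str, Optional[str]] = {
    "parent": None,
    "parent.section_1": "parent",
}


def get_sketch(dag_id: str) -> Optional[dict]:
    """Return the serialized row for dag_id, falling back to its root DAG."""
    row = serialized_rows.get(dag_id)
    if row is not None:
        return row
    # Not stored under its own ID: it may be a subdag, so look up its root.
    root_id = root_dag_ids.get(dag_id)
    if root_id is None:
        return None
    return serialized_rows.get(root_id)


direct = get_sketch("parent")
via_root = get_sketch("parent.section_1")
missing = get_sketch("unknown")
```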
-
static bulk_sync_to_db(dags: List[DAG], session: Session = None)
Saves DAGs as serialized DAG objects in the database. Each DAG is saved in a separate database query.
- Parameters
dags (List[airflow.models.dag.DAG]) -- the DAG objects to save to the DB
session (Session) -- ORM Session
- Returns
None
-
classmethod get_last_updated_datetime(cls, dag_id: str, session: Session = None)
Gets the date when the serialized DAG with the given dag_id was last updated in the serialized_dag table.
- Parameters
dag_id (str) -- DAG ID
session (Session) -- ORM Session
-
classmethod get_max_last_updated_datetime(cls, session: Session = None)
Gets the most recent date on which any DAG was last updated in the serialized_dag table.
- Parameters
session (Session) -- ORM Session
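The two timestamp lookups can be illustrated with plain SQL against an in-memory SQLite database. This is a sketch, not Airflow's implementation: the two-column table is a simplified stand-in, the column name last_updated is an assumption, and the helper functions are hypothetical. Timestamps are stored as ISO 8601 strings so that MAX orders them chronologically.

```python
import sqlite3
from typing import Optional

conn = sqlite3.connect(":memory:")
# Simplified stand-in for the serialized_dag table (column names assumed).
conn.execute(
    "CREATE TABLE serialized_dag (dag_id TEXT PRIMARY KEY, last_updated TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO serialized_dag VALUES (?, ?)",
    [
        ("etl_daily", "2021-01-01T12:00:00"),
        ("etl_hourly", "2021-01-02T08:30:00"),
    ],
)


def get_last_updated(dag_id: str) -> Optional[str]:
    """Timestamp of the last update for one DAG, or None if it is not stored."""
    row = conn.execute(
        "SELECT last_updated FROM serialized_dag WHERE dag_id = ?", (dag_id,)
    ).fetchone()
    return row[0] if row else None


def get_max_last_updated() -> Optional[str]:
    """Most recent update timestamp across all stored DAGs."""
    return conn.execute("SELECT MAX(last_updated) FROM serialized_dag").fetchone()[0]


one = get_last_updated("etl_daily")
absent = get_last_updated("missing")
latest = get_max_last_updated()
```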