apache-airflow-providers-cncf-kubernetes

Package apache-airflow-providers-cncf-kubernetes

Kubernetes

Release: 5.3.0

Provider package

This is a provider package for cncf.kubernetes provider. All classes for this provider package are in airflow.providers.cncf.kubernetes python package.

Installation

You can install this package on top of an existing Airflow 2 installation (see Requirements below) for the minimum Airflow version supported) via pip install apache-airflow-providers-cncf-kubernetes

Requirements

PIP package

Version required

apache-airflow

>=2.3.0

asgiref

>=3.5.2

cryptography

>=2.0.0

kubernetes

>=21.7.0,<24

kubernetes_asyncio

>=18.20.1,<25

Changelog

5.3.0

Features

  • enhance spark_k8s_operator (#29977)

Bug Fixes

  • Fix KubernetesPodOperator xcom push when 'get_logs=False' (#29052)

  • Fixed hanged KubernetesPodOperator (#28336)

Misc

  • Align cncf provider file names with AIP-21 (#29905)

  • Remove "boilerplate" from all taskflow decorators (#30118)

  • Ensure setup/teardown work on a previously decorated function (#30216)

5.2.2

Bug Fixes

  • 'KubernetesPodOperator._render_nested_template_fields' improved by changing the conditionals for a map (#29760)

5.2.1

Bug Fixes

  • Fix @task.kubernetes to receive input and send output (#28942)

5.2.0

Features

  • Add deferrable mode to ''KubernetesPodOperator'' (#29017)

  • Allow setting the name for the base container within K8s Pod Operator (#28808)

Bug Fixes

  • Patch only single label when marking KPO checked (#29279)

5.1.1

Bug Fixes

  • Fix Incorrect 'await_container_completion' (#28771)

5.1.0

Features

  • Add Flink on K8s Operator  (#28512)

  • Add volume-related nested template fields for KPO (#27719)

  • Allow longer pod names for k8s executor / KPO (#27736)

  • Use labels instead of pod name for pod log read in k8s exec (#28546)

Bug Fixes

  • Patch "checked" when pod not successful (#27845)

  • Keep pod name for k8s executor under 63 characters (#28237)

Misc

  • Remove outdated compat imports/code from providers (#28507)

  • Restructure Docs  (#27235)

5.0.0

This release of provider is only available for Airflow 2.3+ as explained in the Apache Airflow providers support policy.

Breaking changes

Previously KubernetesPodOperator considered some settings from the Airflow config’s kubernetes section. Such consideration was deprecated in 4.1.0 and is now removed. If you previously relied on the Airflow config, and you want client generation to have non-default configuration, you will need to define your configuration in an Airflow connection and set KPO to use the connection. See kubernetes provider documentation on defining a kubernetes Airflow connection for details.

Drop support for providing resource as dict in KubernetesPodOperator. You should use container_resources with V1ResourceRequirements.

Param node_selectors has been removed in KubernetesPodOperator; use node_selector instead.

The following backcompat modules for KubernetesPodOperator are removed and you must now use the corresponding objects from the kubernetes library:

  • airflow.providers.cncf.kubernetes.backcompat.pod

  • airflow.providers.cncf.kubernetes.backcompat.pod_runtime_info_env

  • airflow.providers.cncf.kubernetes.backcompat.volume

  • airflow.providers.cncf.kubernetes.backcompat.volume_mount

  • Remove deprecated backcompat objects for KPO (#27518)

  • Remove support for node_selectors param in KPO (#27515)

  • Remove unused backcompat method in k8s hook (#27490)

  • Drop support for providing ''resource'' as dict in ''KubernetesPodOperator'' (#27197)

  • Don't consider airflow core conf for KPO (#26849)

Misc

  • Move min airflow version to 2.3.0 for all providers (#27196)

  • Use log.exception where more economical than log.error (#27517)

Features

KubernetesPodOperator argument name is now optional. Previously, name was a required argument for KubernetesPodOperator when also not supplying pod template or full pod spec. Now, if name is not supplied, task_id will be used.

KubernetesPodOperator argument namespace is now optional. If not supplied via KPO param or pod template file or full pod spec, then we’ll check the airflow conn, then if in a k8s pod, try to infer the namespace from the container, then finally will use the default namespace.

When using an Airflow connection of type kubernetes, if defining the connection in an env var or secrets backend, it’s no longer necessary to prefix the “extra” fields with extra__kubernetes__. If extra contains duplicate fields (one with prefix, one without) then the non-prefixed one will be used.

  • Remove extra__kubernetes__ prefix from k8s hook extras (#27021)

  • Add container_resources as KubernetesPodOperator templatable (#27457)

  • add container_name option for SparkKubernetesSensor (#26560)

  • Allow xcom sidecar container image to be configurable in KPO (#26766)

  • Improve task_id to pod name conversion (#27524)

  • Make pod name optional in KubernetesPodOperator (#27120)

  • Make namespace optional for KPO (#27116)

  • Enable template rendering for env_vars field for the @task.kubernetes decorator (#27433)

Bug Fixes

  • Fix KubernetesHook fail on an attribute absence (#25787)

  • Fix log message for kubernetes hooks (#26999)

  • KPO should use hook's get namespace method to get namespace (#27516)

New deprecations

  • In KubernetesHook.get_namespace, if a connection is defined but a namespace isn’t set, we

    currently return ‘default’; this behavior is deprecated (#27202). In the next release, we’ll return None.

  • Deprecate use of core get_kube_client in PodManager (#26848)

4.4.0

Features

  • feat(KubernetesPodOperator): Add support of container_security_context (#25530)

  • Add @task.kubernetes taskflow decorator (#25663)

  • pretty print KubernetesPodOperator rendered template env_vars (#25850)

Bug Fixes

  • Avoid calculating all elements when one item is needed (#26377)

  • Wait for xcom sidecar container to start before sidecar exec (#25055)

4.3.0

Features

  • Improve taskflow type hints with ParamSpec (#25173)

Bug Fixes

  • Fix xcom_sidecar stuck problem (#24993)

4.2.0

Features

  • Add 'airflow_kpo_in_cluster' label to KPO pods (#24658)

  • Use found pod for deletion in KubernetesPodOperator (#22092)

Bug Fixes

  • Revert "Fix await_container_completion condition (#23883)" (#24474)

  • Update providers to use functools compat for ''cached_property'' (#24582)

Misc

  • Rename 'resources' arg in Kub op to k8s_resources (#24673)

4.1.0

Features

  • Previously, KubernetesPodOperator relied on core Airflow configuration (namely setting for kubernetes executor) for certain settings used in client generation. Now KubernetesPodOperator uses KubernetesHook, and the consideration of core k8s settings is officially deprecated.

  • If you are using the Airflow configuration settings (e.g. as opposed to operator params) to configure the kubernetes client, then prior to the next major release you will need to add an Airflow connection and set your KPO tasks to use that connection.

  • Use KubernetesHook to create api client in KubernetesPodOperator (#20578)

  • [FEATURE] KPO use K8S hook (#22086)

  • Add param docs to KubernetesHook and KubernetesPodOperator (#23955) (#24054)

Bug Fixes

  • Use "remote" pod when patching KPO pod as "checked" (#23676)

  • Don't use the root logger in KPO _suppress function (#23835)

  • Fix await_container_completion condition (#23883)

Misc

  • Migrate Cncf.Kubernetes example DAGs to new design #22441 (#24132)

  • Clean up f-strings in logging calls (#23597)

4.0.2

Bug Fixes

  • Fix: Exception when parsing log #20966 (#23301)

  • Fixed Kubernetes Operator large xcom content Defect  (#23490)

  • Clarify 'reattach_on_restart' behavior (#23377)

4.0.1

Bug Fixes

  • Add k8s container's error message in airflow exception (#22871)

  • KubernetesHook should try incluster first when not otherwise configured (#23126)

  • KubernetesPodOperator should patch "already checked" always (#22734)

  • Delete old Spark Application in SparkKubernetesOperator (#21092)

  • Cleanup dup code now that k8s provider requires 2.3.0+ (#22845)

  • Fix ''KubernetesPodOperator'' with 'KubernetesExecutor'' on 2.3.0 (#23371)

  • Fix KPO to have hyphen instead of period (#22982)

  • Fix new MyPy errors in main (#22884)

4.0.0

Breaking changes

The provider in version 4.0.0 only works with Airflow 2.3+. Please upgrade Airflow to 2.3 version if you want to use the features or fixes in 4.* line of the provider.

The main reason for the incompatibility is using latest Kubernetes Libraries. The cncf.kubernetes provider requires newer version of libraries than Airflow 2.1 and 2.2 used for Kubernetes Executor and that makes the provider incompatible with those Airflow versions.

Features

  • Log traceback only on ''DEBUG'' for KPO logs read interruption (#22595)

  • Update our approach for executor-bound dependencies (#22573)

  • Optionally not follow logs in KPO pod_manager (#22412)

Bug Fixes

  • Stop crashing when empty logs are received from kubernetes client (#22566)

3.1.2 (YANKED)

Bug Fixes

  • Fix mistakenly added install_requires for all providers (#22382)

  • Fix "run_id" k8s and elasticsearch compatibility with Airflow 2.1 (#22385)

Misc

  • Remove RefreshConfiguration workaround for K8s token refreshing (#20759)

3.1.1 (YANKED)

Misc

  • Add Trove classifiers in PyPI (Framework :: Apache Airflow :: Provider)

3.1.0 (YANKED)

Features

  • Add map_index label to mapped KubernetesPodOperator (#21916)

  • Change KubernetesPodOperator labels from execution_date to run_id (#21960)

Misc

  • Support for Python 3.10

  • Fix Kubernetes example with wrong operator casing (#21898)

  • Remove types from KPO docstring (#21826)

3.0.2 (YANKED)

Bug Fixes

  • Add missed deprecations for cncf (#20031)

3.0.1 (YANKED)

Misc

  • Update Kubernetes library version (#18797)

3.0.0

Breaking changes

  • Parameter is_delete_operator_pod default is changed to True (#20575)

  • Simplify KubernetesPodOperator (#19572)

  • Move pod_mutation_hook call from PodManager to KubernetesPodOperator (#20596)

  • Rename ''PodLauncher'' to ''PodManager'' (#20576)

Parameter is_delete_operator_pod has new default

Previously, the default for param is_delete_operator_pod was False, which means that after a task runs, its pod is not deleted by the operator and remains on the cluster indefinitely. With this release, we change the default to True.

Notes on changes KubernetesPodOperator and PodLauncher

Warning

Many methods in KubernetesPodOperator and PodLauncher have been renamed. If you have subclassed KubernetesPodOperator you will need to update your subclass to reflect the new structure. Additionally PodStatus enum has been renamed to PodPhase.

Overview

Generally speaking if you did not subclass KubernetesPodOperator and you did not use the PodLauncher class directly, then you don’t need to worry about this change. If however you have subclassed KubernetesPodOperator, what follows are some notes on the changes in this release.

One of the principal goals of the refactor is to clearly separate the “get or create pod” and “wait for pod completion” phases. Previously the “wait for pod completion” logic would be invoked differently depending on whether the operator were to “attach to an existing pod” (e.g. after a worker failure) or “create a new pod” and this resulted in some code duplication and a bit more nesting of logic. With this refactor we encapsulate the “get or create” step into method KubernetesPodOperator.get_or_create_pod, and pull the monitoring and XCom logic up into the top level of execute because it can be the same for “attached” pods and “new” pods.

The KubernetesPodOperator.get_or_create_pod tries first to find an existing pod using labels specific to the task instance (see KubernetesPodOperator.find_pod). If one does not exist it creates a pod <~.PodManager.create_pod>.

The “waiting” part of execution has three components. The first step is to wait for the pod to leave the Pending phase (~.KubernetesPodOperator.await_pod_start). Next, if configured to do so, the operator will follow the base container logs and forward these logs to the task logger until the base container is done. If not configured to harvest the logs, the operator will instead KubernetesPodOperator.await_container_completion either way, we must await container completion before harvesting xcom. After (optionally) extracting the xcom value from the base container, we await pod completion <~.PodManager.await_pod_completion>.

Previously, depending on whether the pod was “reattached to” (e.g. after a worker failure) or created anew, the waiting logic may have occurred in either handle_pod_overlap or create_new_pod_for_operator.

After the pod terminates, we execute different cleanup tasks depending on whether the pod terminated successfully.

If the pod terminates unsuccessfully, we attempt to log the pod events PodLauncher.read_pod_events>. If additionally the task is configured not to delete the pod after termination, we apply a label KubernetesPodOperator.patch_already_checked> indicating that the pod failed and should not be “reattached to” in a retry. If the task is configured to delete its pod, we delete it KubernetesPodOperator.process_pod_deletion>. Finally, we raise an AirflowException to fail the task instance.

If the pod terminates successfully, we delete the pod KubernetesPodOperator.process_pod_deletion> (if configured to delete the pod) and push XCom (if configured to push XCom).

Details on method renames, refactors, and deletions

In KubernetesPodOperator:

  • Method create_pod_launcher is converted to cached property pod_manager

  • Construction of k8s CoreV1Api client is now encapsulated within cached property client

  • Logic to search for an existing pod (e.g. after an airflow worker failure) is moved out of execute and into method find_pod.

  • Method handle_pod_overlap is removed. Previously it monitored a “found” pod until completion. With this change the pod monitoring (and log following) is orchestrated directly from execute and it is the same whether it’s a “found” pod or a “new” pod. See methods await_pod_start, follow_container_logs, await_container_completion and await_pod_completion.

  • Method create_pod_request_obj is renamed build_pod_request_obj. It now takes argument context in order to add TI-specific pod labels; previously they were added after return.

  • Method create_labels_for_pod is renamed _get_ti_pod_labels. This method doesn’t return all labels, but only those specific to the TI. We also add parameter include_try_number to control the inclusion of this label instead of possibly filtering it out later.

  • Method _get_pod_identifying_label_string is renamed _build_find_pod_label_selector

  • Method _try_numbers_match is removed.

  • Method create_new_pod_for_operator is removed. Previously it would mutate the labels on self.pod, launch the pod, monitor the pod to completion etc. Now this logic is in part handled by get_or_create_pod, where a new pod will be created if necessary. The monitoring etc is now orchestrated directly from execute. Again, see the calls to methods await_pod_start, follow_container_logs, await_container_completion and await_pod_completion.

In class PodManager (formerly PodLauncher):

  • Method start_pod is removed and split into two methods: create_pod and await_pod_start.

  • Method monitor_pod is removed and split into methods follow_container_logs, await_container_completion, await_pod_completion

  • Methods pod_not_started, pod_is_running, process_status, and _task_status are removed. These were needed due to the way in which pod phase was mapped to task instance states; but we no longer do such a mapping and instead deal with pod phases directly and untransformed.

  • Method _extract_xcom is renamed extract_xcom.

  • Method read_pod_logs now takes kwarg container_name

Other changes in pod_manager.py (formerly pod_launcher.py):

  • Class pod_launcher.PodLauncher renamed to pod_manager.PodManager

  • Enum-like class PodStatus is renamed PodPhase, and the values are no longer lower-cased.

  • The airflow.settings.pod_mutation_hook is no longer called in cncf.kubernetes.utils.pod_manager.PodManager.run_pod_async. For KubernetesPodOperator, mutation now occurs in build_pod_request_obj.

  • Parameter is_delete_operator_pod default is changed to True so that pods are deleted after task completion and not left to accumulate. In practice it seems more common to disable pod deletion only on a temporary basis for debugging purposes and therefore pod deletion is the more sensible default.

Features

  • Add params config, in_cluster, and cluster_context to KubernetesHook (#19695)

  • Implement dry_run for KubernetesPodOperator (#20573)

  • Clarify docstring for ''build_pod_request_obj'' in K8s providers (#20574)

Bug Fixes

  • Fix Volume/VolumeMount KPO DeprecationWarning (#19726)

2.2.0

Features

  • Added namespace as a template field in the KPO. (#19718)

  • Decouple name randomization from name kwarg (#19398)

Bug Fixes

  • Checking event.status.container_statuses before filtering (#19713)

  • Coalesce 'extra' params to None in KubernetesHook (#19694)

  • Change to correct type in KubernetesPodOperator (#19459)

2.1.0

Features

  • Add more type hints to PodLauncher (#18928)

  • Add more information to PodLauncher timeout error (#17953)

2.0.3

Bug Fixes

  • Fix KubernetesPodOperator reattach when not deleting pods (#18070)

  • Make Kubernetes job description fit on one log line (#18377)

  • Do not fail KubernetesPodOperator tasks if log reading fails (#17649)

2.0.2

Bug Fixes

  • Fix using XCom with ''KubernetesPodOperator'' (#17760)

  • Import Hooks lazily individually in providers manager (#17682)

2.0.1

Features

  • Enable using custom pod launcher in Kubernetes Pod Operator (#16945)

Bug Fixes

  • BugFix: Using 'json' string in template_field causes issue with K8s Operators (#16930)

2.0.0

Breaking changes

  • Auto-apply apply_default decorator (#15667)

Warning

Due to apply_default decorator removal, this version of the provider requires Airflow 2.1.0+. If your Airflow version is < 2.1.0, and you want to install this provider version, first upgrade Airflow to at least version 2.1.0. Otherwise your Airflow package version will be upgraded automatically and you will have to manually run airflow upgrade db to complete the migration.

Features

  • Add 'KubernetesPodOperator' 'pod-template-file' jinja template support (#15942)

  • Save pod name to xcom for KubernetesPodOperator (#15755)

Bug Fixes

  • Bug Fix Pod-Template Affinity Ignored due to empty Affinity K8S Object (#15787)

  • Bug Pod Template File Values Ignored (#16095)

  • Fix issue with parsing error logs in the KPO (#15638)

  • Fix unsuccessful KubernetesPodOperator final_state call when 'is_delete_operator_pod=True' (#15490)

1.2.0

Features

  • Require 'name' with KubernetesPodOperator (#15373)

  • Change KPO node_selectors warning to proper deprecationwarning (#15507)

Bug Fixes

  • Fix timeout when using XCom with KubernetesPodOperator (#15388)

  • Fix labels on the pod created by ''KubernetesPodOperator'' (#15492)

1.1.0

Features

  • Separate Kubernetes pod_launcher from core airflow (#15165)

  • Add ability to specify api group and version for Spark operators (#14898)

  • Use libyaml C library when available. (#14577)

1.0.2

Bug fixes

  • Allow pod name override in KubernetesPodOperator if pod_template is used. (#14186)

  • Allow users of the KPO to *actually* template environment variables (#14083)

1.0.1

Updated documentation and readme files.

Bug fixes

  • Pass image_pull_policy in KubernetesPodOperator correctly (#13289)

1.0.0

Initial version of the provider.

Was this entry helpful?