apache-airflow-providers-apache-hive

Changelog

8.0.0

Breaking changes

Changed the default value of use_beeline in the hive cli connection to True. Beeline is now enabled by default for this connection type.

Removed deprecated parameter authMechanism from HiveHook and dependent operators. Use auth_mechanism instead in your extra.

HiveOperator: Removed the get_hook method in favor of the hook property.

HiveStatsCollectionOperator: Removed the deprecated col_blacklist in favor of excluded_columns.

  • Setting use_beeline by default for hive cli connection (#38763)

  • Removing deprecated code in hive provider (#38859)
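
As a rough illustration of the renames above, the following sketch rewrites a connection extra dictionary and a set of operator keyword arguments. The helper names (migrate_extra, migrate_stats_kwargs) and the keep_legacy_cli flag are hypothetical and not part of the provider; only the key names authMechanism, auth_mechanism, use_beeline, col_blacklist and excluded_columns come from the notes above. Setting use_beeline to False in the extra to keep the previous behavior is an assumption about how the connection reads this option.

```python
from __future__ import annotations


def migrate_extra(extra: dict, keep_legacy_cli: bool = False) -> dict:
    """Return a copy of a connection extra using the 8.0.0 key names."""
    migrated = dict(extra)
    if "authMechanism" in migrated:
        legacy_value = migrated.pop("authMechanism")
        # An explicit auth_mechanism, if already present, takes precedence.
        migrated.setdefault("auth_mechanism", legacy_value)
    if keep_legacy_cli:
        # Assumption: use_beeline is read from the connection extra, so
        # pinning it to False keeps the pre-8.0.0 default.
        migrated.setdefault("use_beeline", False)
    return migrated


def migrate_stats_kwargs(kwargs: dict) -> dict:
    """Return a copy of operator kwargs with col_blacklist renamed."""
    migrated = dict(kwargs)
    if "col_blacklist" in migrated:
        legacy_value = migrated.pop("col_blacklist")
        migrated.setdefault("excluded_columns", legacy_value)
    return migrated
```

Both helpers return new dictionaries and leave their inputs untouched, so they can be applied to exported connection definitions before anything is written back.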

Features

  • Adding support to hive hook for high availability Hive installations (#38651)

7.0.1

Misc

  • Remove references from the code to Jira Issues (#37807)

  • Unify 'aws_conn_id' type to always be 'str | None' (#37768)

  • Limit 'pandas' to '<2.2' (#37748)

7.0.0

Breaking changes

Removed the ability to specify a proxy user as owner, login, or as_param in the connection. Instead, set the user in the Proxy User connection parameter or pass proxy_user to HiveHook.

  • Simplify hive client connection (#37043)

Misc

  • Fix pyhive hive_pure_sasl extra name (#37323)

6.4.2

Bug Fixes

  • Fix assignment of template field in '__init__' in 'hive-stats' (#36905)

Misc

  • Set min pandas dependency to 1.2.5 for all providers and airflow (#36698)

6.4.1

Bug Fixes

  • Fix assignment of template field in '__init__' in 'hive_to_samba.py' (#36486)

6.4.0

Features

  • Add param proxy user for hive (#36221)

Misc

  • Add code snippet formatting in docstrings via Ruff (#36262)

6.3.0

Note

This release of the provider is only available for Airflow 2.6+, as explained in the Apache Airflow providers support policy.

Misc

  • Bump minimum Airflow version in providers to Airflow 2.6.0 (#36017)

6.2.0

Note

This release of the provider is only available for Airflow 2.5+, as explained in the Apache Airflow providers support policy.

Misc

  • Bump min airflow version of providers (#34728)

  • Consolidate hook management in HiveOperator (#34430)

6.1.6

Misc

  • Refactor regex in providers (#33898)

  • Replace sequence concatenation by unpacking in Airflow providers (#33933)

  • Replace single element slice by next() in hive provider (#33937)

  • Use a single 'with' statement with multiple contexts instead of nested 'with' statements in providers (#33768)

  • Use startswith once with a tuple in Hive hook (#33765)

  • Refactor: Simplify a few loops (#33736)

  • E731: replace lambda by a def method in Airflow providers (#33757)

  • Use f-string instead of  in Airflow providers (#33752)

6.1.5

Note

The provider now uses pure-sasl, a pure-Python implementation of SASL that is better maintained than the previous sasl implementation, although it is slightly slower for the SASL interface. It also allows hive to be installed on Python 3.11.

Misc

  • Bring back hive support for Python 3.11 (#32607)

  • Refactor: Simplify code in Apache/Alibaba providers (#33227)

  • Simplify 'X for X in Y' to 'Y' where applicable (#33453)

  • Replace OrderedDict with plain dict (#33508)

  • Simplify code around enumerate (#33476)

  • Use str.splitlines() to split lines in providers (#33593)

  • Simplify conditions on len() in providers/apache (#33564)

  • Replace repr() with proper formatting (#33520)

  • Avoid importing pandas and numpy in runtime and module level (#33483)

  • Consolidate import and usage of pandas (#33480)

6.1.4

Misc

  • Bring back mysql-connector-python as required dependency (#32989)

6.1.3

Bug Fixes

  • Fix Pandas2 compatibility for Hive (#32752)

Misc

  • Add more accurate typing for DbApiHook.run method (#31846)

  • Move Hive configuration to Apache Hive provider (#32777)

6.1.2

Bug Fixes

  • Add proxy_user template check (#32334)

6.1.1

Note

This release dropped support for Python 3.7.

Bug Fixes

  • Sanitize beeline principal parameter (#31983)

Misc

  • Replace unicodecsv with standard csv library (#31693)

6.1.0

Note

This release of the provider is only available for Airflow 2.4+, as explained in the Apache Airflow providers support policy.

Misc

  • Bump minimum Airflow version in providers (#30917)

  • Update return types of 'get_key' methods on 'S3Hook' (#30923)

6.0.0

Breaking changes

The auth option has moved from the connection extra field to the auth parameter of the Hook. If your connections define auth in their extra, move it to the DAG where your HiveOperator or other Hive-related operators are used.

  • Move auth parameter from extra to Hook parameter (#30212)
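
One way to prepare for this move is to separate the relocated option from the rest of the connection extra. The sketch below assumes the extra is available as a JSON string; split_moved_options and its moved_keys argument are hypothetical names used only for illustration.

```python
from __future__ import annotations

import json


def split_moved_options(
    extra_json: str, moved_keys: tuple[str, ...] = ("auth",)
) -> tuple[dict, dict]:
    """Split a connection extra into (remaining extra, hook keyword arguments)."""
    extra = json.loads(extra_json) if extra_json else {}
    # Pull out only the keys that are no longer read from the connection.
    hook_kwargs = {key: extra.pop(key) for key in moved_keys if key in extra}
    return extra, hook_kwargs
```

The second dictionary holds the values that would be passed where the operator or hook is created in the DAG; the exact parameter names should be checked against the installed provider version. The same split could be reused for the hive_cli_params move in 5.0.0 by passing moved_keys=("hive_cli_params",).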

5.1.3

Bug Fixes

  • Validate Hive Beeline parameters (#29502)

5.1.2

Misc

  • Fixed MyPy errors introduced by new mysql-connector-python (#28995)

5.1.1

Bug Fixes

  • Move local_infile option from extra to hook parameter (#28811)

5.1.0

Features

As of version 5.1.0, the apache.hive provider supplies the hive macros that were previously provided by Airflow.

  • Move Hive macros to the provider (#28538)

  • Make pandas dependency optional for Amazon Provider (#28505)

5.0.0

Breaking changes

The hive_cli_params option has moved from the connection to the Hook. If your connections define hive_cli_params in their extra, move it to the DAG where your HiveOperator is used.

  • Move hive_cli_params to hook parameters (#28101)

Features

  • Improve filtering for invalid schemas in Hive hook (#27808)

4.1.1

Bug Fixes

  • Bump common.sql provider to 1.3.1 (#27888)

4.1.0

Note

This release of the provider is only available for Airflow 2.3+, as explained in the Apache Airflow providers support policy.

Misc

  • Move min airflow version to 2.3.0 for all providers (#27196)

Bug Fixes

  • Filter out invalid schemas in Hive hook (#27647)

4.0.1

Misc

  • Add common-sql lower bound for common-sql (#25789)

4.0.0

Breaking Changes

  • The hql parameter in get_records of HiveServer2Hook has been renamed to sql to match the get_records DbApiHook signature. If you passed it as a positional argument, nothing changes for you, but if you passed it as a keyword argument, you need to rename it.

  • The hive_conf parameter has been renamed to parameters and is now the second parameter, to match the get_records signature from the DbApiHook. You need to rename it if you used it.

  • The schema parameter in get_records is an optional extra keyword argument that you can add, to match the signature of get_records from DbApiHook.

  • Deprecate hql parameters and synchronize DBApiHook method APIs (#25299)

  • Remove Smart Sensors (#25507)
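
For call sites that still pass the old keyword names, a small translation step can make the rename explicit. This is a minimal sketch: translate_get_records_kwargs is a hypothetical helper, not a provider API, and only the hql to sql and hive_conf to parameters renames are taken from the notes above.

```python
from __future__ import annotations

# Old keyword name -> new keyword name, as listed in the 4.0.0 notes.
RENAMES = {"hql": "sql", "hive_conf": "parameters"}


def translate_get_records_kwargs(kwargs: dict) -> dict:
    """Map pre-4.0.0 get_records keyword names to the 4.0.0 names."""
    translated: dict = {}
    for key, value in kwargs.items():
        new_key = RENAMES.get(key, key)
        if new_key in translated:
            # Both the old and the new name were supplied for one argument.
            raise TypeError(f"got multiple values for argument {new_key!r}")
        translated[new_key] = value
    return translated
```

Keywords that were not renamed, such as schema, pass through unchanged, and supplying both an old and a new name for the same argument raises an error rather than silently picking one.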

3.1.0

Features

  • Move all SQL classes to common-sql provider (#24836)

Bug Fixes

  • fix connection extra parameter 'auth_mechanism' in 'HiveMetastoreHook' and 'HiveServer2Hook' (#24713)

3.0.0

Breaking changes

Note

This release of the provider is only available for Airflow 2.2+, as explained in the Apache Airflow providers support policy.

Misc

  • chore: Refactoring and Cleaning Apache Providers (#24219)

  • AIP-47 - Migrate hive DAGs to new design #22439 (#24204)

2.3.3

Bug Fixes

  • Fix HiveToMySqlOperator's wrong docstring (#23316)

2.3.2

Bug Fixes

  • Fix mistakenly added install_requires for all providers (#22382)

2.3.1

Misc

  • Add Trove classifiers in PyPI (Framework :: Apache Airflow :: Provider)

2.3.0

Features

  • Set larger limit get_partitions_by_filter in HiveMetastoreHook (#21504)

Bug Fixes

  • Fix Python 3.9 support in Hive (#21893)

  • Fix key typo in 'template_fields_renderers' for 'HiveOperator' (#21525)

Misc

  • Support for Python 3.10

  • Add how-to guide for hive operator (#21590)

2.2.0

Features

  • Add more SQL template fields renderers (#21237)

  • Add conditional 'template_fields_renderers' check for new SQL lexers (#21403)

2.1.0

Features

  • hive provider: restore HA support for metastore (#19777)

Bug Fixes

2.0.3

Bug Fixes

  • fix get_connections deprecation warn in hivemetastore hook (#18854)

2.0.2

Bug fixes

  • HiveHook fix get_pandas_df() failure when it tries to read an empty table (#17777)

Misc

  • Optimise connection importing for Airflow 2.2.0

2.0.1

Features

  • Add Python 3.9 support (#15515)

2.0.0

Breaking changes

  • Auto-apply apply_default decorator (#15667)

Warning

Due to apply_default decorator removal, this version of the provider requires Airflow 2.1.0+. If your Airflow version is < 2.1.0, and you want to install this provider version, first upgrade Airflow to at least version 2.1.0. Otherwise your Airflow package version will be upgraded automatically and you will have to manually run airflow upgrade db to complete the migration.
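
A check like the following could be run before installing, to see whether an environment already meets the minimum from the warning above. It is a sketch under stated assumptions: parse_release and meets_minimum are hypothetical helper names, and pre-release suffixes such as ".dev0" are simply ignored.

```python
from __future__ import annotations

import re

MIN_AIRFLOW = (2, 1, 0)  # minimum version stated in the warning above


def parse_release(version: str) -> tuple[int, int, int]:
    """Extract the leading major.minor.patch numbers from a version string."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    # A missing patch component is treated as 0.
    major, minor, patch = match.groups(default="0")
    return int(major), int(minor), int(patch)


def meets_minimum(
    version: str, minimum: tuple[int, int, int] = MIN_AIRFLOW
) -> bool:
    """Compare numerically, so that 2.10.x sorts after 2.1.x."""
    return parse_release(version) >= minimum
```

The version string would typically come from the installed Airflow package, for example through importlib.metadata.version("apache-airflow").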

1.0.3

Bug fixes

  • Fix mistake and typos in doc/docstrings (#15180)

  • Fix grammar and remove duplicate words (#14647)

  • Resolve issue related to HiveCliHook kill (#14542)

1.0.2

Bug fixes

  • Corrections in docs and tools after releasing provider RCs (#14082)

1.0.1

Updated documentation and readme files.

Bug fixes

  • Remove password if in LDAP or CUSTOM mode HiveServer2Hook (#11767)

1.0.0

Initial version of the provider.
