Apache Hive Operators

The Apache Hive data warehouse software facilitates reading, writing, and managing large datasets residing in distributed storage using SQL. Structure can be projected onto data already in storage.

HiveOperator

This operator executes HQL code or a Hive script in a specific Hive database.

tests/system/providers/apache/hive/example_twitter_dag.py

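    from airflow.providers.apache.hive.operators.hive import HiveOperator

    # Adjacent f-strings are joined with no separator, so each fragment
    # ends with a space to keep the HQL keywords apart.
    load_to_hive = HiveOperator(
        task_id=f"load_{channel}_to_hive",
        hql=(
            f"LOAD DATA INPATH '{hdfs_dir}{channel}/{file_name}' "
            f"INTO TABLE {channel} "
            f"PARTITION(dt='{dt}')"
        ),
    )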
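
Python joins adjacent string literals with no separator, so the statement passed as hql is sensitive to spacing at the fragment boundaries. As a minimal sketch, a hypothetical helper can build the statement with explicit spaces so that it can be checked before it is handed to the operator. The name build_load_hql and the sample argument values are made up for illustration and are not part of Airflow or the example DAG.

```python
def build_load_hql(hdfs_dir: str, channel: str, file_name: str, dt: str) -> str:
    """Return a Hive LOAD DATA statement for one channel partition.

    Hypothetical helper for illustration only; not part of Airflow.
    """
    # Each fragment except the last ends with a space, because adjacent
    # string literals are concatenated with nothing between them.
    return (
        f"LOAD DATA INPATH '{hdfs_dir}{channel}/{file_name}' "
        f"INTO TABLE {channel} "
        f"PARTITION(dt='{dt}')"
    )


if __name__ == "__main__":
    # Sample values are assumptions, chosen only to show the resulting string.
    print(build_load_hql("/tmp/downloads/", "to_me", "tweets.csv", "2024-01-01"))
```

The returned string could then be passed as the hql argument of HiveOperator in place of the inline f-strings.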

Reference

For more information, see the Apache Hive documentation.
