SQLExecuteQueryOperator to connect to Apache Impala
Use the SQLExecuteQueryOperator
to execute SQL queries against an
Apache Impala cluster.
Note
A dedicated Impala operator may have been used previously.
Now that it is deprecated, use the SQLExecuteQueryOperator
instead.
Note
Make sure you have installed the apache-airflow-providers-apache-impala
package to enable Impala support.
Using the Operator
Use the conn_id argument to connect to your Apache Impala instance,
where the connection metadata is structured as follows:
| Parameter | Input |
|---|---|
| Host: string | Impala daemon hostname or IP address |
| Schema: string | The default database name (optional) |
| Login: string | Username for authentication (if applicable) |
| Password: string | Password for authentication (if applicable) |
| Port: int | Impala service port (default: 21050) |
| Extra: JSON | Additional connection configuration |
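One way to supply these fields is through an environment variable that holds the connection as JSON. The following is a minimal sketch, assuming an Airflow version that accepts JSON-serialized connections (2.3 or later) and a connection ID of impala_default. The host, login, and password values are hypothetical placeholders, and the extra field is left empty because its keys are not listed here.

```python
import json
import os

# Hypothetical values for illustration; replace them with your cluster details.
impala_connection = {
    "conn_type": "impala",
    "host": "impala.example.com",  # Impala daemon hostname or IP address
    "schema": "default",           # default database name (optional)
    "login": "airflow_user",       # only if authentication applies
    "password": "change-me",       # only if authentication applies
    "port": 21050,                 # default Impala service port
    "extra": {},                   # additional configuration, left empty here
}

# Airflow looks up connections in variables named AIRFLOW_CONN_<CONN_ID>,
# with the connection ID in upper case.
os.environ["AIRFLOW_CONN_IMPALA_DEFAULT"] = json.dumps(impala_connection)
```

In practice this variable is usually set in the environment of the Airflow scheduler and workers rather than from Python; Python is used here only to show the shape of the value.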
An example usage of the SQLExecuteQueryOperator to connect to Apache Impala is as follows:
tests/system/apache/impala/example_impala.py
create_table_impala_task = SQLExecuteQueryOperator(
    task_id="create_table_impala",
    sql="""
        CREATE TABLE IF NOT EXISTS impala_example (
            a STRING,
            b INT
        )
        PARTITIONED BY (c INT)
    """,
)
Reference
For further information, see:
Note
Parameters provided directly via SQLExecuteQueryOperator() take precedence over those specified
in the Airflow connection metadata (such as schema, login, password, etc.).