DatabricksCopyIntoOperator¶
Use the DatabricksCopyIntoOperator to import data into a Databricks table using the COPY INTO command.
Using the Operator¶
The operator loads data from a specified location into a table using a configured endpoint. The only required parameters are:
- table_name - string with the table name
- file_location - string with the URI of the data to load
- file_format - string specifying the file format of the data to load. Supported formats are CSV, JSON, AVRO, ORC, PARQUET, TEXT, and BINARYFILE.
- One of sql_endpoint_name (name of the Databricks SQL endpoint to use) or http_path (HTTP path for a Databricks SQL endpoint or Databricks cluster).
Other parameters are optional and are described in the class documentation; a minimal sketch using only the required parameters is shown below.
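For instance, a minimal invocation might look like the following sketch. It uses http_path rather than sql_endpoint_name; the connection id, warehouse path, table name, and file location are hypothetical placeholders, not values from this guide:

from airflow.providers.databricks.operators.databricks_sql import DatabricksCopyIntoOperator

# Minimal sketch with only the required parameters. The connection id,
# HTTP path, table name, and file location are placeholder values.
copy_json = DatabricksCopyIntoOperator(
    task_id="copy_json",
    databricks_conn_id="databricks_default",  # assumed default connection name
    http_path="/sql/1.0/warehouses/1234567890abcdef",  # placeholder warehouse path
    table_name="my_table",
    file_format="JSON",
    file_location="s3://my-bucket/json-data/",
)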
Examples¶
Importing CSV data¶
An example usage of the DatabricksCopyIntoOperator to import CSV data into a table is as follows:
# Example of importing data using the COPY INTO SQL command
from airflow.providers.databricks.operators.databricks_sql import DatabricksCopyIntoOperator

# connection_id and sql_endpoint_name are defined elsewhere in the DAG file
import_csv = DatabricksCopyIntoOperator(
    task_id="import_csv",
    databricks_conn_id=connection_id,
    sql_endpoint_name=sql_endpoint_name,
    table_name="my_table",
    file_format="CSV",
    file_location="abfss://container@account.dfs.core.windows.net/my-data/csv",
    format_options={"header": "true"},  # first line of the files contains column names
    force_copy=True,  # reload files even if they were already loaded
)
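The format_options dictionary is passed through to the generated COPY INTO statement's FORMAT_OPTIONS clause (here, header tells the CSV reader that the first line of each file contains column names), while force_copy=True enables the force copy option so that files are loaded even if they were already ingested before.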