Google Cloud Translate Operators¶
Prerequisite Tasks¶
To use these operators, you must do a few things:
Select or create a Cloud Platform project using the Cloud Console.
Enable billing for your project, as described in the Google Cloud documentation.
Enable the API, as described in the Cloud Console documentation.
Install API libraries via pip.
pip install 'apache-airflow[google]'
Detailed information is available for Installation.
CloudTranslateTextOperator¶
Translate a string or list of strings.
For parameter definitions, take a look at CloudTranslateTextOperator.
Using the operator¶
Basic usage of the operator:
product_set_create = CloudTranslateTextOperator(
    task_id="translate",
    values=["zażółć gęślą jaźń"],
    target_language="en",
    format_="text",
    source_language=None,
    model="base",
)
The translation result is a dictionary (for a single input string) or a list of dictionaries (for a list of strings), and is accessible through Airflow's usual XCom mechanisms:
translation_access = BashOperator(
    task_id="access",
    bash_command="echo '{{ task_instance.xcom_pull(\"translate\")[0] }}'",
)
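Downstream Python tasks usually need only the translated strings rather than the full result. The sketch below is a minimal helper that does not depend on Airflow. It assumes each result dictionary follows the shape returned by the V2 client (keys such as `translatedText`, `detectedSourceLanguage`, and `input`); the name `extract_translations` and the sample payload are hypothetical.

```python
from typing import Any, Dict, List, Union

# Assumed shape of one V2 translation result; verify against your own XCom payload.
TranslationResult = Dict[str, Any]


def extract_translations(
    result: Union[TranslationResult, List[TranslationResult]],
) -> List[str]:
    """Return the translated strings from a single result or a list of results."""
    # A single input string yields one dictionary; normalize it to a list.
    results = [result] if isinstance(result, dict) else list(result)
    return [item["translatedText"] for item in results]


# Illustrative payload only; the values are made up, not real API output.
sample = [
    {
        "translatedText": "example translation",
        "detectedSourceLanguage": "pl",
        "input": "zażółć gęślą jaźń",
    }
]
print(extract_translations(sample))
```

In a DAG, the argument would be the value returned by `task_instance.xcom_pull("translate")`.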
Templating¶
template_fields: Sequence[str] = (
    "values",
    "target_language",
    "format_",
    "source_language",
    "model",
    "gcp_conn_id",
    "impersonation_chain",
)
TranslateTextOperator¶
Translate an array of one or more text (or HTML) items.
Intended for moderate amounts of text; for large volumes, use the TranslateTextBatchOperator.
For parameter definitions, take a look at TranslateTextOperator.
Using the operator¶
Basic usage of the operator:
translate_text = TranslateTextOperator(
    task_id="translate_v3_op",
    contents=["Ciao mondo!", "Mi puoi prendere una tazza di caffè, per favore?"],
    source_language_code="it",
    target_language_code="en",
)
TranslateTextBatchOperator¶
Translate a large amount of text into up to 10 target languages in a single run. The list of input files and other options are provided through the input configuration.
For parameter definitions, take a look at TranslateTextBatchOperator.
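This section has no usage example for the batch operator. As a rough sketch, the helper below assembles keyword arguments for such a task and enforces the 10-language limit mentioned above. The parameter names (`source_language_code`, `target_language_codes`, `input_configs`, `output_config`) and the nested `gcs_source` / `gcs_destination` keys are assumptions modeled on the V3 batch translation request; the helper name, bucket paths, and language codes are hypothetical. Check the operator's parameter reference before relying on them.

```python
from typing import Any, Dict, List

MAX_TARGET_LANGUAGES = 10  # limit stated in the operator description


def build_batch_translate_kwargs(
    source_language_code: str,
    target_language_codes: List[str],
    input_uris: List[str],
    output_uri_prefix: str,
) -> Dict[str, Any]:
    """Build keyword arguments for a batch translation task (assumed names)."""
    if not target_language_codes:
        raise ValueError("At least one target language is required.")
    if len(target_language_codes) > MAX_TARGET_LANGUAGES:
        raise ValueError(
            f"At most {MAX_TARGET_LANGUAGES} target languages are allowed, "
            f"got {len(target_language_codes)}."
        )
    return {
        "source_language_code": source_language_code,
        "target_language_codes": list(target_language_codes),
        # One input config per Cloud Storage file (assumed structure).
        "input_configs": [{"gcs_source": {"input_uri": uri}} for uri in input_uris],
        "output_config": {"gcs_destination": {"output_uri_prefix": output_uri_prefix}},
    }


batch_kwargs = build_batch_translate_kwargs(
    source_language_code="en",
    target_language_codes=["fr", "de"],
    input_uris=["gs://example-bucket/input/file1.txt"],  # hypothetical path
    output_uri_prefix="gs://example-bucket/output/",  # hypothetical path
)
```

If the assumed names match the operator's signature, the dictionary could be unpacked into the operator call alongside `task_id`, `project_id`, and `location`.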
TranslateCreateDatasetOperator¶
Create a native translation dataset using Cloud Translate API (Advanced V3).
For parameter definitions, take a look at TranslateCreateDatasetOperator.
Using the operator¶
Basic usage of the operator:
create_dataset_op = TranslateCreateDatasetOperator(
    task_id="translate_v3_ds_create",
    dataset=DATASET,
    project_id=PROJECT_ID,
    location=REGION,
)
TranslateImportDataOperator¶
Import data into an existing native dataset using Cloud Translate API (Advanced V3).
For parameter definitions, take a look at TranslateImportDataOperator.
Using the operator¶
Basic usage of the operator:
import_ds_data_op = TranslateImportDataOperator(
    task_id="translate_v3_ds_import_data",
    dataset_id=create_dataset_op.output["dataset_id"],
    input_config={
        "input_files": [{"usage": "UNASSIGNED", "gcs_source": {"input_uri": DATASET_DATA_PATH}}]
    },
    project_id=PROJECT_ID,
    location=REGION,
)
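When a dataset is fed from several files, the `input_config` dictionary can be generated rather than written by hand. The following sketch builds the same structure as the example above from a mapping of Cloud Storage URIs to usage labels. The helper name and the path are hypothetical, and the usage labels other than `UNASSIGNED` (`TRAIN`, `VALIDATION`, `TEST`) are an assumption based on the V3 dataset import API.

```python
from typing import Any, Dict

# "UNASSIGNED" appears in the example above; the other labels are assumptions.
ALLOWED_USAGES = {"UNASSIGNED", "TRAIN", "VALIDATION", "TEST"}


def build_import_input_config(files: Dict[str, str]) -> Dict[str, Any]:
    """Map {gcs_uri: usage} to the input_config structure used for imports."""
    input_files = []
    for uri, usage in files.items():
        if usage not in ALLOWED_USAGES:
            raise ValueError(f"Unsupported usage {usage!r} for {uri}")
        input_files.append({"usage": usage, "gcs_source": {"input_uri": uri}})
    return {"input_files": input_files}


input_config = build_import_input_config(
    {"gs://example-bucket/data/pairs.tsv": "UNASSIGNED"}  # hypothetical path
)
```

The resulting dictionary has the same shape as the literal passed to `input_config` in the operator call above.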
TranslateDatasetsListOperator¶
Get a list of translation datasets using Cloud Translate API (Advanced V3).
For parameter definitions, take a look at TranslateDatasetsListOperator.
Using the operator¶
Basic usage of the operator:
list_datasets_op = TranslateDatasetsListOperator(
    task_id="translate_v3_list_ds",
    project_id=PROJECT_ID,
    location=REGION,
)
TranslateDeleteDatasetOperator¶
Delete a native translation dataset using Cloud Translate API (Advanced V3).
For parameter definitions, take a look at TranslateDeleteDatasetOperator.
Using the operator¶
Basic usage of the operator:
delete_ds_op = TranslateDeleteDatasetOperator(
    task_id="translate_v3_ds_delete",
    dataset_id=create_dataset_op.output["dataset_id"],
    project_id=PROJECT_ID,
    location=REGION,
)
More information¶
See:
Base (V2): Google Cloud Translate documentation.
Advanced (V3): Google Cloud Translate (Advanced) documentation.
Datasets: Legacy and native dataset comparison.