AWS Glue DataBrew

AWS Glue DataBrew is a visual data preparation tool that makes it easier for data analysts and data scientists to clean and normalize data for analytics and machine learning (ML). You can choose from over 250 prebuilt transformations to automate data preparation tasks without writing any code, such as filtering anomalies, converting data to standard formats, and correcting invalid values. After your data is ready, you can immediately use it for analytics and ML projects.

Prerequisite Tasks

To use these operators, you must do a few things:

Operators

Start an AWS Glue DataBrew job

To submit a new AWS Glue DataBrew job, use GlueDataBrewStartJobOperator.

tests/system/providers/amazon/aws/example_glue_databrew.py

start_job = GlueDataBrewStartJobOperator(task_id="startjob", job_name=job_name, delay=15)
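To make the start-then-wait pattern behind such a task concrete, here is a minimal sketch. It is not the operator's actual implementation. It assumes that delay is the number of seconds between status checks, and that the client behaves like boto3.client("databrew"), whose start_job_run and describe_job_run calls return a RunId and a State respectively. The names start_and_wait, max_attempts, sleep, and FakeDataBrewClient are hypothetical and exist only for this illustration.

```python
import time

# Terminal states reported by DataBrew's DescribeJobRun API.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "STOPPED", "TIMEOUT"}


def start_and_wait(client, job_name, delay=15, max_attempts=60, sleep=time.sleep):
    """Start a DataBrew job run and poll until it reaches a terminal state.

    `client` is assumed to behave like boto3.client("databrew").
    `delay` is the assumed pause, in seconds, between status checks.
    Returns a (run_id, final_state) tuple.
    """
    run_id = client.start_job_run(Name=job_name)["RunId"]
    for _ in range(max_attempts):
        state = client.describe_job_run(Name=job_name, RunId=run_id)["State"]
        if state in TERMINAL_STATES:
            return run_id, state
        sleep(delay)
    raise TimeoutError(
        f"Job {job_name!r} run {run_id} still running after {max_attempts} checks"
    )


class FakeDataBrewClient:
    """Hypothetical stand-in for the boto3 client so the sketch runs offline."""

    def __init__(self, states):
        self._states = iter(states)

    def start_job_run(self, Name):
        return {"RunId": "run-0001"}

    def describe_job_run(self, Name, RunId):
        # Each call reports the next scripted state.
        return {"State": next(self._states)}


if __name__ == "__main__":
    fake = FakeDataBrewClient(["STARTING", "RUNNING", "SUCCEEDED"])
    # delay=0 keeps the offline demo instant.
    print(start_and_wait(fake, "my-job", delay=0))
```

Against a real account, the fake client would be replaced by boto3.client("databrew"), which requires valid AWS credentials and an existing DataBrew job. Inside Airflow, the operator handles this on your behalf, so a loop like this is only useful for understanding or for scripts that run outside a DAG.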
