Yandex.Cloud Data Proc Operators

Yandex Data Proc is a service that helps you deploy Apache Hadoop® and Apache Spark™ clusters in the Yandex Cloud infrastructure.

With Data Proc, you can manage the cluster size and node capacity, as well as work with various Apache® services, such as Spark, HDFS, YARN, Hive, HBase, Oozie, Sqoop, Flume, Tez, and Zeppelin.

Apache Hadoop is used for storing and analyzing structured and unstructured big data.

Apache Spark is an engine for fast data processing that can be integrated with Apache Hadoop and other storage systems.

Using the operators

To learn how to use Data Proc operators, see example DAGs.
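
As a rough illustration of what such a DAG can look like, the sketch below chains three tasks: create a cluster, run a PySpark job on it, then delete the cluster. It is a minimal sketch under several assumptions, not the provider's reference example. The zone, bucket name, and job file URI are hypothetical placeholders. The import path `airflow.providers.yandex.operators.yandexcloud_dataproc` is the one used by older provider releases and may differ in the version you have installed. The job and delete tasks leave `cluster_id` unset on the assumption that the operators pick up the cluster created upstream; pass `cluster_id` explicitly if that does not hold in your setup.

```python
from datetime import datetime

# Hypothetical placeholders: replace with values from your own cloud folder.
AVAILABILITY_ZONE_ID = "ru-central1-c"
S3_BUCKET_NAME_FOR_JOB_LOGS = "my-dataproc-logs"
PYSPARK_JOB_URI = "s3a://my-dataproc-jobs/jobs/example_job.py"


def build_dag():
    """Return a DAG that creates a cluster, runs one PySpark job, and deletes the cluster."""
    # Imported inside the function so this file can be read without Airflow installed.
    # Assumption: module path from older provider releases; adjust to your version.
    from airflow import DAG
    from airflow.providers.yandex.operators.yandexcloud_dataproc import (
        DataprocCreateClusterOperator,
        DataprocCreatePysparkJobOperator,
        DataprocDeleteClusterOperator,
    )

    with DAG(
        dag_id="example_yandexcloud_dataproc_sketch",
        start_date=datetime(2021, 1, 1),
        schedule=None,  # Airflow 2.4+ keyword; older releases use schedule_interval
        catchup=False,
    ) as dag:
        create_cluster = DataprocCreateClusterOperator(
            task_id="create_cluster",
            zone=AVAILABILITY_ZONE_ID,
            s3_bucket=S3_BUCKET_NAME_FOR_JOB_LOGS,
            computenode_count=1,
        )

        run_pyspark_job = DataprocCreatePysparkJobOperator(
            task_id="run_pyspark_job",
            main_python_file_uri=PYSPARK_JOB_URI,
        )

        delete_cluster = DataprocDeleteClusterOperator(
            task_id="delete_cluster",
            # Run even if the job fails, so the cluster is not left running.
            trigger_rule="all_done",
        )

        create_cluster >> run_pyspark_job >> delete_cluster

    return dag
```

To let the scheduler discover the DAG, a file in the DAGs folder would assign the result at module level, for example `dag = build_dag()`. Setting `trigger_rule="all_done"` on the delete task is a design choice to avoid paying for a cluster that stays up after a failed job.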
