Yandex.Cloud Data Proc Operators

Yandex Data Proc is a service that helps you deploy Apache Hadoop® and Apache Spark™ clusters in the Yandex Cloud infrastructure.

With Data Proc, you can manage the cluster size and node capacity, as well as work with various Apache® services, such as Spark, HDFS, YARN, Hive, HBase, Oozie, Sqoop, Flume, Tez, and Zeppelin.

Apache Hadoop is used for storing and analyzing structured and unstructured big data.

Apache Spark is a tool for fast data processing that can be integrated with Apache Hadoop and other storage systems.

Using the operators

To learn how to use Data Proc operators, see example DAGs.
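
As a rough orientation, a typical Data Proc DAG creates a cluster, runs one or more jobs on it, and deletes the cluster afterwards. The following is a minimal sketch of that lifecycle, not a copy of the shipped example DAGs. The DAG id, bucket name, bucket layout, and availability zone are hypothetical placeholders. It assumes that the Yandex provider package is installed, that the default Yandex Cloud connection supplies the folder and credentials, and that the job operator picks up the ID of the cluster created earlier in the same DAG. The operator module path has differed between provider versions, so the sketch tries both; the `schedule` argument assumes a recent Airflow release.

```python
from datetime import datetime

DAG_ID = "example_dataproc_lifecycle"  # hypothetical DAG id


def pyspark_job_kwargs(bucket: str) -> dict:
    """Build keyword arguments for the PySpark job.

    The bucket layout (jobs/, input/, output/) is a hypothetical example.
    """
    return {
        "main_python_file_uri": f"s3a://{bucket}/jobs/main.py",
        "args": [f"s3a://{bucket}/input/", f"s3a://{bucket}/output/"],
        "properties": {"spark.submit.deployMode": "cluster"},
    }


def build_dag(bucket: str = "my-dataproc-bucket", zone: str = "ru-central1-b"):
    """Create the DAG: create cluster -> run PySpark job -> delete cluster."""
    # Imports live inside the function so that this file can still be read
    # and imported on a machine without Airflow or the Yandex provider.
    from airflow import DAG

    try:
        from airflow.providers.yandex.operators.dataproc import (
            DataprocCreateClusterOperator,
            DataprocCreatePysparkJobOperator,
            DataprocDeleteClusterOperator,
        )
    except ImportError:
        # Assumption: older provider releases used this module name.
        from airflow.providers.yandex.operators.yandexcloud_dataproc import (
            DataprocCreateClusterOperator,
            DataprocCreatePysparkJobOperator,
            DataprocDeleteClusterOperator,
        )

    with DAG(
        dag_id=DAG_ID,
        schedule=None,  # run only when triggered manually
        start_date=datetime(2024, 1, 1),
        catchup=False,
    ) as dag:
        create_cluster = DataprocCreateClusterOperator(
            task_id="create_cluster",
            zone=zone,
            s3_bucket=bucket,  # bucket for job logs
        )

        run_pyspark_job = DataprocCreatePysparkJobOperator(
            task_id="run_pyspark_job",
            **pyspark_job_kwargs(bucket),
        )

        # "all_done" makes the cleanup run even if the job task failed,
        # so that the cluster does not keep running.
        delete_cluster = DataprocDeleteClusterOperator(
            task_id="delete_cluster",
            trigger_rule="all_done",
        )

        create_cluster >> run_pyspark_job >> delete_cluster

    return dag


try:
    dag = build_dag()
except ImportError:
    # Airflow or the Yandex provider is not installed in this environment.
    dag = None
```

Setting the trigger rule on the delete task is a design choice worth keeping: without it, a failed job would cause the cleanup task to be skipped and the cluster would remain up. Check the operator reference for your installed provider version before relying on any of the parameter names used here.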