Yandex.Cloud Data Proc Operators
Yandex Data Proc is a service that helps you deploy Apache Hadoop® and Apache Spark™ clusters in the Yandex Cloud infrastructure.
With Data Proc, you can manage the cluster size and node capacity, as well as work with various Apache® services, such as Spark, HDFS, YARN, Hive, HBase, Oozie, Sqoop, Flume, Tez, and Zeppelin.
Apache Hadoop is used for storing and analyzing structured and unstructured big data.
Apache Spark is a tool for quick data processing that can be integrated with Apache Hadoop and other storage systems.
Using the operators
To learn how to use Data Proc operators, see example DAGs.
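As a rough orientation, a minimal sketch of such a DAG might look like the following. It assumes the `apache-airflow-providers-yandex` package is installed and a Yandex Cloud connection is already configured in Airflow. The zone, bucket name, script URI, DAG id, and the `build_dag` helper are hypothetical placeholders, not values from this guide. The job and delete tasks omit an explicit `cluster_id` on the assumption that the provider hands it over from the create task; check the provider's example DAGs before relying on that.

```python
from datetime import datetime

# Hypothetical placeholder values; replace with resources from your own cloud folder.
AVAILABILITY_ZONE = "ru-central1-b"
LOG_BUCKET = "my-dataproc-logs"  # assumed bucket name for job logs
PYSPARK_MAIN_URI = "s3a://my-dataproc-jobs/word_count.py"  # assumed script location


def build_dag():
    """Create a DAG that starts a cluster, runs one PySpark job, then deletes the cluster."""
    # Airflow imports live inside the factory so the constants above can be
    # inspected without an Airflow installation.
    from airflow import DAG
    from airflow.providers.yandex.operators.yandexcloud_dataproc import (
        DataprocCreateClusterOperator,
        DataprocCreatePysparkJobOperator,
        DataprocDeleteClusterOperator,
    )

    with DAG(
        dag_id="example_dataproc_sketch",
        start_date=datetime(2024, 1, 1),
        schedule=None,  # Airflow 2.4+; older releases use schedule_interval
        catchup=False,
    ) as dag:
        create_cluster = DataprocCreateClusterOperator(
            task_id="create_cluster",
            zone=AVAILABILITY_ZONE,
            s3_bucket=LOG_BUCKET,
        )
        run_job = DataprocCreatePysparkJobOperator(
            task_id="run_pyspark_job",
            main_python_file_uri=PYSPARK_MAIN_URI,
        )
        delete_cluster = DataprocDeleteClusterOperator(
            task_id="delete_cluster",
            trigger_rule="all_done",  # clean up even if the job fails
        )
        # Run the three tasks strictly in sequence.
        create_cluster >> run_job >> delete_cluster
    return dag
```

In a real DAG file, assign `dag = build_dag()` at module level so that the scheduler can discover the DAG.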