Apache Spark is a general framework for distributed computing that offers high performance for both batch and interactive processing. It exposes APIs for Java, Python, and Scala and consists of Spark core and several related projects:
- Spark SQL - Module for working with structured data. Allows you to seamlessly mix SQL queries with Spark programs.
- Spark Streaming - API that allows you to build scalable, fault-tolerant streaming applications.
- MLlib - API that implements common machine learning algorithms.
- GraphX - API for graphs and graph-parallel computation.
Cloudera supports Spark core, Spark SQL (including DataFrames), Spark Streaming, and MLlib. Cloudera does not currently offer commercial support for