CDS Powered by Apache Spark Requirements

The following sections describe software requirements for CDS Powered by Apache Spark.

CDH Versions

Supported versions of CDH are described below.

The latest release, 2.0 Release 2, addresses a Hive compatibility issue that affects CDH 5.7.6 and higher, CDH 5.8.5 and higher, CDH 5.9.2 and higher, and CDH 5.10.1 and higher. If you are using one of these CDH versions, you must upgrade to the Spark 2.0 Release 2 parcel to avoid Spark 2 job failures when using Hive functionality.

CDS 2 Powered by Apache Spark Version    CDH Version
2.0 Release 2                            CDH 5.7, CDH 5.8, CDH 5.9, CDH 5.10
2.0 Release 1                            CDH 5.7 up to 5.7.5, CDH 5.8 up to 5.8.4, CDH 5.9 up to 5.9.1, CDH 5.10.0

Spark 2.0 Release 2 is required for any higher maintenance release of these CDH versions.
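
The table above can be read as a simple lookup. The following Python sketch is one possible way to encode it. The function name supported_cds_releases and the dictionary RELEASE_1_MAX_MAINTENANCE are hypothetical and are not part of any Cloudera tool. The sketch assumes the CDH version is given in full three-part form, such as 5.9.2.

```python
# Highest maintenance release of each CDH minor line that works with
# 2.0 Release 1, copied from the table above.
RELEASE_1_MAX_MAINTENANCE = {
    (5, 7): 5,
    (5, 8): 4,
    (5, 9): 1,
    (5, 10): 0,
}


def supported_cds_releases(cdh_version):
    """Return the CDS 2.0 releases usable with a CDH version such as '5.9.2'."""
    parts = cdh_version.split(".")
    if len(parts) != 3:
        raise ValueError("expected a full version such as '5.9.2'")
    major, minor, maintenance = (int(p) for p in parts)

    if (major, minor) not in RELEASE_1_MAX_MAINTENANCE:
        raise ValueError("CDH %s is not listed in the table" % cdh_version)

    # Release 2 covers every listed CDH line; Release 1 only covers the
    # lower maintenance releases of each line.
    if maintenance <= RELEASE_1_MAX_MAINTENANCE[(major, minor)]:
        return ["2.0 Release 1", "2.0 Release 2"]
    return ["2.0 Release 2"]
```

For example, supported_cds_releases("5.8.5") returns only 2.0 Release 2, which matches the upgrade requirement described earlier in this section.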

A Spark 1.6 service can co-exist on the same cluster as Spark 2. The two services are configured so that they do not conflict, and both run on the same YARN cluster. Spark 2 uses the external shuffle service from the CDH installation if Spark 1 is already installed, or installs the shuffle service itself if necessary. Only the external shuffle service classes from the CDH installation can be used.

Cloudera Manager Versions

Applicable versions of Cloudera Manager for Spark 2 are described below.

CDS 2 Powered by Apache Spark Version    Cloudera Manager Version
2.0 Release 2                            Cloudera Manager 5.8.3, 5.9 and higher
2.0 Release 1                            Cloudera Manager 5.8.3, 5.9 and higher
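
Because both releases have the same Cloudera Manager requirement, a single check is enough. The Python sketch below reads the table literally: version 5.8.3 exactly, or 5.9 and anything higher. The table does not say whether later 5.8.x maintenance releases qualify, so the sketch conservatively rejects them; that choice is an assumption. The function name cloudera_manager_is_supported is hypothetical.

```python
def cloudera_manager_is_supported(cm_version):
    """Return True if a Cloudera Manager version such as '5.8.3' meets the requirement."""
    parts = tuple(int(p) for p in cm_version.split("."))
    # Pad "5.9" to (5, 9, 0) so that tuples compare component by component.
    parts = parts + (0,) * (3 - len(parts))
    # Assumption: only 5.8.3 itself is accepted from the 5.8 line.
    return parts[:3] == (5, 8, 3) or parts[:3] >= (5, 9, 0)
```

Comparing integer tuples rather than strings avoids the common mistake of ranking 5.10 below 5.9.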

Scala 2.11 Requirement

Spark 2 does not work with Scala 2.10. Use Scala 2.11 only.
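
One way to catch a Scala 2.10 dependency before it reaches the cluster is to inspect artifact names. The Python sketch below assumes the common convention in which Scala libraries carry the Scala binary version as a suffix of the artifact ID, for example spark-core_2.11. Artifacts without such a suffix are treated as not Scala-specific. The function names are hypothetical.

```python
import re

# Matches the Scala binary-version suffix of an artifact ID, e.g. "_2.11".
SCALA_SUFFIX = re.compile(r"_(\d+\.\d+)$")


def scala_binary_version(artifact_id):
    """Return the Scala binary version encoded in an artifact ID, or None."""
    match = SCALA_SUFFIX.search(artifact_id)
    return match.group(1) if match else None


def incompatible_artifacts(artifact_ids):
    """List artifacts built for a Scala version other than 2.11."""
    return [a for a in artifact_ids
            if scala_binary_version(a) not in (None, "2.11")]
```

Given the artifact IDs spark-core_2.11, spark-sql_2.10, and commons-lang3, incompatible_artifacts flags only spark-sql_2.10.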