Install and Upgrade Notes

Upgrades from Cloudera Enterprise 5.15 to 6.0.0 Not Supported

You cannot upgrade to Cloudera Manager or CDH 6.0.0 from Cloudera Manager or CDH 5.15.

Upgrades from Cloudera Enterprise 6.0 Beta Release to 6.x General Release Not Supported

You cannot upgrade to any Cloudera Manager or CDH 6.x general release from the Cloudera Manager or CDH 6.0 Beta release.

Cloudera Data Science Workbench 1.4.x (and lower) is Not Supported with Cloudera Enterprise 6

Cloudera Data Science Workbench (1.4.x and lower) is not currently supported with Cloudera Manager 6.0.x and CDH 6.0.x. If you try to upgrade, you will be asked to remove the CDSW service from your cluster. Cloudera Data Science Workbench will be supported with Cloudera Enterprise 6 in a future release.

Hue requires manual installation of psycopg2

If you are installing or upgrading to CDH 6.0.0 and using PostgreSQL as the Hue database, you must install psycopg2 2.5.4 or higher on all Hue hosts. See Installing the psycopg2 Python package.

Cloudera Issue: OPSAPS-47080

CDH Upgrade fails to delete Solr data from HDFS

The CDH upgrade process fails to delete Solr data from HDFS, and the recreated collections fail to initialize because the old indexes still exist.

Workaround: Perform the following steps after you run the CDH Upgrade wizard and before you finalize the HDFS upgrade:
  1. Log in to the Cloudera Manager Admin Console.
  2. Go to the Solr service page.
  3. Stop the Solr service and dependent services. Click Actions > Stop.
  4. Click Actions > Reinitialize Solr State for Upgrade.
  5. Click Actions > Bootstrap Solr Configuration.
  6. Start the Solr and dependent services. Click Actions > Start.
  7. Click Actions > Bootstrap Solr Collections.

Affected Versions: CDH 6.0.0

Fixed Versions: Cloudera Manager 6.0.1

Cloudera Issue: OPSAPS-47502

Package Installation of CDH Fails

When you install CDH with packages from a custom repository, ensure that the version you choose for Select the version of CDH matches the version of CDH in the custom repository. You select the CDH version and specify the custom repository during the Select Repository stage of installation.

If the versions do not match, installation fails.

Affected Versions: Cloudera Manager 6.x

Fixed Versions: N/A

Apache Issue: N/A

Cloudera Issue: OPSAPS-45703

Flume Kafka client incompatible changes in CDH 5.8

Because the CDH 5.8 Flume Kafka client stores consumer offsets in Kafka instead of ZooKeeper, during an upgrade to CDH 5.8 the Flume agents might not consume data, or might consume duplicate data (if kafka.auto.offset.reset=smallest).

Cloudera Issue: TSB-173

Uninstall CDH 5 Sqoop connectors for Teradata and Netezza before upgrading to CDH 6

Sqoop includes two connectors, one for Teradata and one for Netezza. The connectors are released in separate parcels and tarballs and can be installed in Cloudera Manager or manually. The versioning of the connectors takes the form <connector_version>c<major_cdh_version>. For example, 1.6c5 refers to the connector 1.6 for CDH 5. The manifest files do not prohibit installing the CDH 5 connectors on CDH 6, but they are not compatible with CDH 6.

If you have the CDH 5 connectors installed, they are not automatically upgraded during the CDH upgrade, and because they are not compatible with CDH 6, you should uninstall them before upgrading. Keeping the CDH 5 connectors does not cause the upgrade itself to fail; instead, Sqoop jobs that use them fail at runtime. Cloudera will release connectors for CDH 6 at a later time.

For more information about the Teradata and Netezza connectors, go to Cloudera Enterprise Connector Documentation and choose the connector and version to see the documentation for your connector.

Unsupported Sqoop options cause upgrade failures

CDH 6 introduces new fail-fast checks for unsupported options. Check the jobs stored in your Sqoop metastore and remove all unsupported options. Earlier CDH versions silently ignored some unsupported options; in CDH 6, the same options cause the job to fail immediately. See the related JIRAs listed in Apache Sqoop Incompatible Changes.

Generated Avro code from CDH 5 should be regenerated when upgrading

Changes in logical types cause Avro code generated with CDH 6 to differ from Avro code generated with CDH 5, so previously generated code will not necessarily work in CDH 6. Cloudera recommends regenerating your Avro code when you upgrade.
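
If you normally generate classes through a build tool, rerunning avro-tools or the avro-maven-plugin against the Avro version bundled with CDH 6 is typically all that is needed. As a minimal sketch, regeneration can also be done programmatically with Avro's SpecificCompiler (from the avro-compiler artifact); the schema and output paths below are placeholders for your own files.

import java.io.File;
import java.io.IOException;

import org.apache.avro.compiler.specific.SpecificCompiler;

public class RegenerateAvroClasses {
    public static void main(String[] args) throws IOException {
        // Placeholder paths: point these at your own schema file and source tree.
        File schemaFile = new File("src/main/avro/user.avsc");
        File outputDir = new File("src/main/java");

        // Compiling against the Avro version shipped with CDH 6 replaces the
        // specific-record classes that were generated under CDH 5.
        SpecificCompiler.compileSchema(schemaFile, outputDir);
    }
}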

Upgrading Apache Parquet to CDH 6

Parquet packages and the project’s group ID were renamed, and some of the class methods were removed.

If you directly consume the Parquet API instead of using Parquet through a CDH component, you need to update and recompile your code. See Parquet API Change for details of the changes.
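
As a rough illustration of the package rename (the class below is just one example; see Parquet API Change for the full list of renamed packages and removed methods), imports written against the CDH 5 Parquet packages need to be updated along these lines:

// Before (CDH 5): Parquet classes lived under the "parquet" package, and the
// artifacts were published under the pre-Apache com.twitter Maven group ID:
//   import parquet.hadoop.ParquetWriter;
//   import parquet.avro.AvroParquetWriter;

// After (CDH 6): the same classes live under org.apache.parquet, and the
// artifacts are published under the org.apache.parquet group ID.
import org.apache.parquet.hadoop.ParquetWriter;

public class ParquetImportMigration {
    // Referencing the class is enough to show the package move; real code keeps
    // building readers and writers as before, just against the new imports
    // (and, where methods were removed, their replacements).
    public static Class<?> writerClass() {
        return ParquetWriter.class;
    }
}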

Upgrade to CDH 5.13 or higher Requires Pre-installation of Spark 2.1 or Spark 2.2

If your cluster has Spark 2.0 or Spark 2.1 installed and you want to upgrade to CDH 5.13 or higher, you must first upgrade to Spark 2.1 release 2 or later before upgrading CDH. To install these versions of Spark, do the following before running the CDH Upgrade Wizard:

  1. Install the Custom Service Descriptor (CSD) file.
  2. Download, distribute, and activate the Parcel for the version of Spark that you are installing:
    • Spark 2.1 release 2: The parcel name includes "cloudera2" in its name.
    • Spark 2.2 release 1: The parcel name includes "cloudera1" in its name.
    See Managing Parcels.

Affected versions: CDH 5.13.0 and higher

Cloudera Issue: CDH-56775

Sentry may require increased Java heap settings before upgrading to CDH 5.13

Before upgrading to CDH 5.13 or higher, you may need to increase the size of the Java heap for Sentry. A warning is displayed during the upgrade, but it is your responsibility to ensure that this setting is adjusted properly before proceeding. See Performance Guidelines.

Affected versions: CDH 5.13 or higher

Cloudera Issue: OPSAPS-42541

Apache MapReduce Jobs May Fail During Rolling Upgrade to CDH 5.11.0 or CDH 5.11.1

In CDH 5.11, Cloudera introduced four new counters that are reported by MapReduce jobs. During a rolling upgrade from a cluster running CDH 5.10.x or lower to CDH 5.11.0 or CDH 5.11.1, a MapReduce job with an application master running on a host running CDH 5.10.x or lower may launch a map or reduce task on one of the newly upgraded CDH 5.11.0 or CDH 5.11.1 hosts. The new task attempts to report the new counter values, which the old application master does not understand, causing an error in the logs similar to the following:
2017-06-08 17:43:37,173 WARN [Socket Reader #1 for port 41187]
org.apache.hadoop.ipc.Server: Unable to read call parameters for client 10.17.242.22on
connection protocol org.apache.hadoop.mapred.TaskUmbilicalProtocol for rpcKind
RPC_WRITABLE
java.lang.ArrayIndexOutOfBoundsException: 23
   at
...

This error could cause the task and the job to fail.

Workaround:

Avoid performing a rolling upgrade to CDH 5.11.0 or CDH 5.11.1 from CDH 5.10.x or lower. Instead, skip CDH 5.11.0 and CDH 5.11.1 if you are performing a rolling upgrade, and upgrade to CDH 5.12 or higher, or CDH 5.11.2 or higher when the release becomes available.

Cloudera Issue: DOCS-2384, TSB-241

Cloudera Manager setting the catalogd default JVM memory to 4 GB can cause out-of-memory errors on upgrade to Cloudera Manager 5.7 or higher

After upgrading to Cloudera Manager 5.7 or higher, you might see a reduced Java heap maximum on the Impala Catalog Server due to a change in its default value. Upgrading from a Cloudera Manager version lower than 5.7 to Cloudera Manager 5.8.2 no longer causes any effective change in the Impala Catalog Server Java heap size.

When upgrading from Cloudera Manager 5.7 or later to Cloudera Manager 5.8.2, if the Impala Catalog Server Java Heap Size is set to the default (4 GB), it is automatically changed to 1/4 of the physical RAM on that host or 32 GB, whichever is lower. This can result in a higher heap, which could cause additional resource contention, or a lower heap, which could cause out-of-memory errors.
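
As a rough sketch of the rule described above (an illustration of the arithmetic only, not Cloudera Manager's actual implementation), the new default works out as follows:

public class CatalogdHeapDefault {
    private static final long GIB = 1024L * 1024L * 1024L;

    // Illustration only: the new default is one quarter of the host's physical
    // RAM, capped at 32 GB, as described above.
    public static long newDefaultHeapBytes(long physicalRamBytes) {
        return Math.min(physicalRamBytes / 4, 32L * GIB);
    }

    public static void main(String[] args) {
        // A 64 GB host gets a 16 GB heap; a 256 GB host is capped at 32 GB.
        System.out.println(newDefaultHeapBytes(64L * GIB) / GIB + " GB");
        System.out.println(newDefaultHeapBytes(256L * GIB) / GIB + " GB");
    }
}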

Cloudera Issue: OPSAPS-34039

Out-of-memory errors may occur with Oracle JDK 1.8

The total JVM memory footprint for JDK8 can be larger than that of JDK7 in some cases. This may result in out-of-memory errors.

Workaround: Increase the maximum heap size (-Xmx). In the case of MapReduce, for example, increase Reduce Task Maximum Heap Size in Cloudera Manager (mapred.reduce.child.java.opts, or mapreduce.reduce.java.opts for YARN) to avoid out-of-memory errors during the shuffle phase.