Install and Upgrade Known Issues

Flume Kafka client incompatible changes in CDH 5.8

Because the CDH 5.8 Flume Kafka client stores offsets in Kafka rather than in ZooKeeper, Flume agents might fail to consume data, or might consume duplicate data (if kafka.auto.offset.reset=smallest), during an upgrade to CDH 5.8.

Bug: TSB-173

Workaround: See Upgrading to CDH 5.8 When Using the Flume Kafka Client.

Upgrades to CDH 5.4.1 from Releases Earlier than 5.4.0 May Fail

Problem: Because of a change in the implementation of the NameNode metadata upgrade mechanism, upgrading to CDH 5.4.1 from a release earlier than 5.4.0 can take an inordinately long time. In a cluster with NameNode high availability (HA) configured and a large number of edit logs, the upgrade can fail with errors indicating a timeout in the pre-upgrade step on the JournalNodes.

What to do:

To avoid the problem: Do not upgrade to CDH 5.4.1; upgrade to CDH 5.4.2 instead.

If you experience the problem: If you have already started an upgrade and seen it fail, contact Cloudera Support. This problem involves no risk of data loss, and manual recovery is possible.

If you have already completed an upgrade to CDH 5.4.1, or are installing a new cluster: In this case you are not affected and can continue to run CDH 5.4.1.

Potential job failures during YARN rolling upgrades to CDH 5.3.4

Problem: A MapReduce security fix introduced a compatibility issue that results in job failures during YARN rolling upgrades from CDH 5.3.3 to CDH 5.3.4.

Release affected: CDH 5.3.4

Release containing the fix: CDH 5.3.5

Workarounds: You can use any one of the following workarounds for this issue:
  • Upgrade to CDH 5.3.5.
  • Restart any jobs that might have failed during the upgrade.
  • Explicitly set the version of MapReduce to use, so that it is selected on a per-job basis:
    1. Update the YARN property, MR Application Classpath (mapreduce.application.classpath), either in Cloudera Manager or in the mapred-site.xml file. Remove all existing values and add a new entry: <parcel-path>/lib/hadoop-mapreduce/*, where <parcel-path> is the absolute path to the parcel installation. For example, the default installation path for the CDH 5.3.3 parcel would be: /opt/cloudera/parcels/CDH-5.3.3-1.cdh5.3.3.p0.5/lib/hadoop-mapreduce/*.
    2. Wait until jobs submitted with the above client configuration change have run to completion.
    3. Upgrade to CDH 5.3.4.
    4. Update the MR Application Classpath (mapreduce.application.classpath) property to point to the new CDH 5.3.4 parcel.

      Do not delete the old parcel until after all jobs submitted prior to the upgrade have finished running.
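The classpath entry described in step 1 can be sketched as follows. This is a minimal illustration, not a Cloudera tool: the helper name mr_classpath_property is hypothetical, and the parcel path is the default CDH 5.3.3 location quoted above. Substitute the absolute path of your own parcel installation.

```python
def mr_classpath_property(parcel_path):
    """Render a mapred-site.xml property that pins MapReduce to one parcel.

    parcel_path is the absolute path to the parcel installation
    (an assumption supplied by the caller, not detected automatically).
    """
    # The single classpath entry is <parcel-path>/lib/hadoop-mapreduce/*
    value = parcel_path.rstrip("/") + "/lib/hadoop-mapreduce/*"
    return (
        "<property>\n"
        "  <name>mapreduce.application.classpath</name>\n"
        f"  <value>{value}</value>\n"
        "</property>"
    )


# Default installation path for the CDH 5.3.3 parcel, as given in step 1.
print(mr_classpath_property("/opt/cloudera/parcels/CDH-5.3.3-1.cdh5.3.3.p0.5"))
```

After the upgrade in step 3, the same helper can be called with the CDH 5.3.4 parcel path to produce the value for step 4.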

NameNode Incorrectly Reports Missing Blocks During Rolling Upgrade

Problem: During a rolling upgrade to any of the releases listed below, the NameNode may report missing blocks after rolling back multiple DataNodes. This is caused by a race condition with block reporting between the DataNode and the NameNode. No permanent data loss occurs, but data can be unavailable for up to six hours before the problem corrects itself.

Releases affected: CDH 5.0.6, 5.1.5, 5.2.5, 5.3.3, 5.4.1, 5.4.2

What to do:

To avoid the problem: Cloudera advises skipping the affected releases and installing a release containing the fix. For example, do not upgrade to CDH 5.4.2; upgrade to CDH 5.4.3 instead.

The releases containing the fix are: CDH 5.3.4, 5.4.3

If you have already completed an upgrade to an affected release, or are installing a new cluster: You can continue to run the release, or upgrade to a release that is not affected.

No in-place upgrade to CDH 5 from CDH 4

Cloudera fully supports upgrade from Cloudera Enterprise 4 and CDH 4 to Cloudera Enterprise 5. Upgrade requires uninstalling the CDH 4 packages before installing CDH 5 packages. See the CDH 5 upgrade documentation for instructions.

CDH 4 and Cloudera Manager 4 End of Maintenance

Cloudera Manager version 4 and CDH 4 reached End of Maintenance (EOM) on August 9, 2015. Cloudera does not support or provide patches for any Cloudera Manager version 4 or CDH 4 release after that date.

Upgrading to CDH 5.4 or later requires an HDFS upgrade

Upgrading to CDH 5.4.0 or later from an earlier CDH 5 release requires an HDFS upgrade, and upgrading from a release earlier than CDH 5.2.0 requires additional steps. See Upgrading from an Earlier CDH 5 Release to the Latest Release for further information. See also What's New In CDH 5.4.x.
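The rule above can be expressed as a small check. This is a sketch under stated assumptions: the function names are hypothetical, release strings are assumed to be dotted numbers such as "5.3.3", and only upgrades between CDH 5 releases are covered.

```python
def parse_release(release):
    """Turn a dotted release string such as '5.3.3' into a tuple of ints."""
    return tuple(int(part) for part in release.split("."))


def upgrade_requirements(source, target):
    """Report what an upgrade between two CDH 5 releases requires."""
    src, dst = parse_release(source), parse_release(target)
    # An HDFS upgrade is required when crossing into CDH 5.4.0 or later.
    needs_hdfs_upgrade = src < (5, 4, 0) <= dst
    # Releases earlier than CDH 5.2.0 need additional steps on top of that.
    needs_additional_steps = needs_hdfs_upgrade and src < (5, 2, 0)
    return {
        "hdfs_upgrade": needs_hdfs_upgrade,
        "additional_steps": needs_additional_steps,
    }


print(upgrade_requirements("5.1.3", "5.4.2"))
print(upgrade_requirements("5.3.3", "5.4.2"))
```

The check only flags which case applies; the actual procedure is in the upgrade documentation referenced above.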

Upgrading from CDH 4 requires an HDFS upgrade

Upgrading from CDH 4 requires an HDFS upgrade. See Upgrading from CDH 4 to CDH 5 for further information. See also What's New In CDH 5.4.x.

JDK Does Not Have Up-to-date tzdata

The Europe/Moscow time zone has changed. JDK releases earlier than 8u31 and 7u75 do not incorporate this change. Cloudera ships JDK 7u67, which does not reflect these tzdata updates. If you run UDFs that perform time zone conversion, you receive incorrect results.

Workaround: Upgrade to JDK 8u31 or 7u75, or to a later update of either.
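To check a list of hosts against these thresholds, a version comparison along the following lines may help. It is only a sketch: the function names are hypothetical, version strings are assumed to use the short "7u67" form (optionally prefixed with "JDK"), and only JDK 7 and 8 are handled because those are the only versions discussed here.

```python
import re

# Minimum update level that includes the Europe/Moscow change, per JDK major version.
MIN_UPDATE = {7: 75, 8: 31}


def parse_jdk(version):
    """Parse strings such as '7u67' or 'JDK8u31' into (major, update)."""
    match = re.fullmatch(r"(?:JDK)?\s*(\d+)u(\d+)", version.strip(), flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"unrecognized JDK version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def has_moscow_tz_update(version):
    """Return True if the JDK is at or above the update level named above."""
    major, update = parse_jdk(version)
    if major not in MIN_UPDATE:
        raise ValueError(f"JDK {major} is not covered by this check")
    return update >= MIN_UPDATE[major]


print(has_moscow_tz_update("JDK7u67"))  # the JDK that Cloudera ships
print(has_moscow_tz_update("8u31"))
```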

CDH 5 requires JDK 1.7

JDK 1.6 is not supported on any CDH 5 release, but before CDH 5.4.0, CDH libraries were compatible with JDK 1.6. As of CDH 5.4.0, CDH libraries are no longer compatible with JDK 1.6, and applications that use CDH libraries must use JDK 1.7.

In addition, you must upgrade your cluster to a supported version of JDK 1.7 before upgrading to CDH 5. See Upgrading to Oracle JDK 1.7 before Upgrading to CDH 5 for instructions.

Extra step needed on Ubuntu Trusty if you add the Cloudera repository

If you install or upgrade CDH on Ubuntu Trusty using the command line, and add the Cloudera repository yourself (rather than using the "1-click Install" method), you must perform an additional step to ensure that you get the CDH version of ZooKeeper rather than the version that is bundled with Trusty. See Steps to Install CDH 5 Manually.

No upgrade directly from CDH 3 to CDH 5

You must upgrade to CDH 4, then to CDH 5. See the CDH 4 documentation for instructions on upgrading from CDH 3 to CDH 4.

Upgrading hadoop-kms from 5.2.x and 5.3.x releases fails on SLES

Upgrading hadoop-kms fails on SLES when you try to upgrade an existing version from 5.2.x releases earlier than 5.2.4, and from 5.3.x releases earlier than 5.3.2. For details and troubleshooting instructions, see Troubleshooting: Upgrading hadoop-kms from 5.2.x and 5.3.x Releases on SLES.
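The affected version ranges can be captured in a short predicate. This is an illustrative sketch with a hypothetical function name; it assumes the installed hadoop-kms release is given as a dotted string such as "5.2.3".

```python
def kms_upgrade_affected_on_sles(installed):
    """Return True if upgrading from this release hits the SLES hadoop-kms issue."""
    version = tuple(int(part) for part in installed.split("."))
    if version[:2] == (5, 2):
        # 5.2.x releases earlier than 5.2.4 are affected.
        return version < (5, 2, 4)
    if version[:2] == (5, 3):
        # 5.3.x releases earlier than 5.3.2 are affected.
        return version < (5, 3, 2)
    return False


for release in ("5.2.3", "5.2.4", "5.3.1", "5.3.2"):
    print(release, kms_upgrade_affected_on_sles(release))
```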

After upgrading from a release earlier than CDH 4.6, you may see reports of corrupted files

Some older versions of CDH do not correctly handle DataNodes that hold a large number of blocks. The problem exists in CDH 4.6, 4.7, 4.8, 5.0, and 5.1. The symptom is that the NameNode Web UI and the fsck command incorrectly report missing blocks, even when those blocks are present.

The cause of the problem is that if the DataNode attempts to send a block report that is larger than the maximum RPC buffer size, the NameNode rejects the report. This prevents the NameNode from becoming aware of the blocks on the affected DataNodes. The maximum buffer size is controlled by the ipc.maximum.data.length property, which defaults to 64 MB.

This problem does not exist in CDH 4.5 and earlier because those versions have no maximum RPC buffer size. Starting in CDH 5.2, DataNodes send an individual block report for each storage volume, which mitigates the problem.

Bug: HADOOP-9676

Severity: Medium

Workaround: Immediately after upgrading, increase the value of ipc.maximum.data.length; Cloudera recommends doubling the default value, from 64 MB to 128 MB:
<property>
  <name>ipc.maximum.data.length</name>
  <value>134217728</value> 
</property> 
  • In a Cloudera Manager installation, set this property in the hdfs_service_config_safety_valve.
  • In a command-line-only installation, add and set this property in core-site.xml.

After setting ipc.maximum.data.length, restart the NameNode(s).
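The property value is expressed in bytes, so the figure used in the workaround is 128 MB converted to bytes. The arithmetic is sketched below; the helper names are hypothetical, and the report sizes passed in are made-up numbers for illustration only.

```python
def mb_to_bytes(megabytes):
    """Convert mebibytes to bytes, the unit used by ipc.maximum.data.length."""
    return megabytes * 1024 * 1024


def report_accepted(report_bytes, max_data_length):
    """A block report larger than the RPC buffer limit is rejected by the NameNode."""
    return report_bytes <= max_data_length


default_limit = mb_to_bytes(64)        # 67108864, the default
recommended_limit = 2 * default_limit  # 134217728, the doubled value

print(default_limit, recommended_limit)
# A hypothetical 100 MB report fits only under the doubled limit.
print(report_accepted(mb_to_bytes(100), default_limit))
print(report_accepted(mb_to_bytes(100), recommended_limit))
```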

Must build native libraries when installing from tarballs

When installing Hadoop from Cloudera tarballs, you must build your own native libraries. The tarballs do not include libraries that are built for the different distributions and architectures.