Known Issues and Workarounds in Cloudera Manager 5
The following sections describe the current known issues in Cloudera Manager 5.
HDFS HA clusters see NameNode failures when KDC connectivity is bad
When KDC connectivity is bad, the JVM waits 30 seconds before retrying or declaring a connection failure. Meanwhile, the JournalNode write timeout is only 20 seconds, and a JournalNode write needs KDC authentication for the first write or under troubled connectivity.
Workaround: In krb5.conf, set the kdc_timeout parameter value to 3 seconds. In Cloudera Manager, you can do this by adding the kdc_timeout parameter to the Advanced Configuration Snippet (Safety Valve) for [libdefaults] section of krb5.conf property. This gives the JVM enough time to try connecting to a KDC before the JournalNode write times out.
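The timing mismatch behind this issue can be checked with simple arithmetic. The following is a minimal sketch, not part of Cloudera Manager: the helper name is hypothetical, and it assumes that every KDC connection attempt blocks for the full timeout when connectivity is bad.

```python
# Timeouts quoted in the issue description, in seconds.
JVM_DEFAULT_KDC_TIMEOUT_S = 30
JOURNALNODE_WRITE_TIMEOUT_S = 20


def kdc_attempts_within_write_timeout(kdc_timeout_s,
                                      write_timeout_s=JOURNALNODE_WRITE_TIMEOUT_S):
    """Return how many full KDC connection attempts fit inside the write timeout.

    Hypothetical helper: assumes each attempt takes the whole kdc_timeout.
    """
    if kdc_timeout_s <= 0:
        raise ValueError("kdc_timeout_s must be positive")
    return write_timeout_s // kdc_timeout_s


if __name__ == "__main__":
    for timeout in (JVM_DEFAULT_KDC_TIMEOUT_S, 3):
        attempts = kdc_attempts_within_write_timeout(timeout)
        print(f"kdc_timeout={timeout}s: {attempts} attempt(s) fit in the write timeout")
```

Under these assumptions, the default 30-second timeout allows no attempt to complete within the 20-second JournalNode write timeout, while a 3-second timeout leaves room for several attempts.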
Stale Kerberos configuration reported after deploying Kerberos client configuration
After you make Kerberos configuration changes while Manage krb5.conf is enabled, the configuration issue 'Cluster has stale kerberos configuration' might display and might not disappear after you redeploy the Kerberos client configuration.
Workaround: Make a different, non-Kerberos edit to a configuration, and save that change. Revert that change immediately afterward.
The HDFS File browser in Cloudera Manager fails when HDFS federation is enabled
Workaround: Use the command-line hdfs dfs commands to directly manipulate HDFS files when federation is enabled. CDH supports HDFS federation.
Oozie, Solr, and HttpFS keystore passwords are presented in clear text
Because of Tomcat restrictions, the Oozie, Solr and HttpFS keystore passwords are sent as clear text on the machines running the services. They are hidden in the Cloudera Manager Admin Console.
Hive Metastore canary fails to drop database
The Hive Metastore canary fails to drop the database due to HIVE-11418.
Workaround: Disable the Hive Metastore canary health test:
- Go to the Hive service.
- Click the Configuration tab.
- Locate the Hive Metastore Canary Health Test property.
- Deselect the Hive Metastore Canary Health Test checkbox for the Hive Metastore Server Default Group.
- Click Save Changes to commit the changes.
Cloudera Manager upgrade fails due to incorrect Sqoop 2 path
- Workaround for Upgrading from Cloudera Manager 3 or 4 to Cloudera Manager 5.4.0 or 5.4.1
- Log in to your Sqoop 2 server host using SSH and move the Derby database files to the new location, usually from /var/lib/sqoop2/repository to /var/lib/sqoop2/repositoy.
- Start Sqoop 2. If you found this problem while upgrading CDH, run the Sqoop 2 database upgrade command using the Actions drop-down menu for Sqoop 2.
- Workaround for Upgrading from Cloudera Manager 5.4.0 or 5.4.1 to Cloudera Manager 5.4.3
- Log in to your Sqoop 2 server host using SSH and move the Derby database files to the new location, usually from /var/lib/sqoop2/repositoy to /var/lib/sqoop2/repository.
- Start Sqoop 2, or if you found this problem while upgrading CDH, run the Sqoop 2 database upgrade command using the Actions drop-down menu for Sqoop 2.
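The move step in both workarounds can be scripted. The following is a minimal sketch under stated assumptions: the helper name is hypothetical, it assumes the Sqoop 2 server is stopped and that you run it as a user allowed to modify /var/lib/sqoop2, and it refuses to overwrite an existing destination.

```python
import shutil
import tempfile
from pathlib import Path


def move_derby_repository(src, dst):
    """Move the Sqoop 2 Derby repository directory from src to dst.

    Hypothetical helper: fails if src is missing or dst already exists,
    so nothing is overwritten by accident.
    """
    src, dst = Path(src), Path(dst)
    if not src.is_dir():
        raise FileNotFoundError(f"source directory not found: {src}")
    if dst.exists():
        raise FileExistsError(f"destination already exists: {dst}")
    shutil.move(str(src), str(dst))
    return dst


if __name__ == "__main__":
    # Demonstration in a scratch directory. On a real host the two paths are
    # the ones named in the workaround, and the direction depends on which
    # upgrade you are performing.
    with tempfile.TemporaryDirectory() as scratch:
        old = Path(scratch) / "repositoy"
        old.mkdir()
        (old / "example.dat").write_text("placeholder")
        new = move_derby_repository(old, Path(scratch) / "repository")
        print(sorted(p.name for p in new.iterdir()))
```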
NameNode incorrectly reports missing blocks during rolling upgrade
During a rolling upgrade to any of the CDH releases listed below, the NameNode may report missing blocks after rolling back multiple DataNodes. This is caused by a race condition with block reporting between the DataNode and the NameNode. No permanent data loss occurs, but data can be unavailable for up to six hours before the problem corrects itself.
Releases affected: CDH 5.0.6, 5.1.5, 5.2.5, 5.3.3, 5.4.1, 5.4.2.
Releases containing the fix: CDH 5.3.4, 5.4.3
- To avoid the problem - Cloudera advises skipping the affected releases and installing a release containing the fix. For example, do not upgrade to CDH 5.4.2; upgrade to CDH 5.4.3 instead.
- If you have already completed an upgrade to an affected release, or are installing a new cluster - You can continue to run the release, or upgrade to a release that is not affected.
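When planning an upgrade, the release lists above can be encoded as a simple lookup. This is only an illustrative sketch; the function name and return labels are hypothetical, and the sets contain nothing beyond the releases listed in this section.

```python
# Release lists copied from this section.
AFFECTED_RELEASES = {"5.0.6", "5.1.5", "5.2.5", "5.3.3", "5.4.1", "5.4.2"}
FIXED_RELEASES = {"5.3.4", "5.4.3"}


def rolling_upgrade_status(target_release):
    """Classify a CDH rolling-upgrade target against the lists above."""
    if target_release in AFFECTED_RELEASES:
        return "affected"
    if target_release in FIXED_RELEASES:
        return "fixed"
    # Releases not named in this section are not classified here.
    return "not listed"


if __name__ == "__main__":
    for release in ("5.4.2", "5.4.3"):
        print(release, rolling_upgrade_status(release))
```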
Using ext3 for server dirs can easily hit the inode limit
Using the ext3 filesystem for the Cloudera Manager command storage directory may cause that directory to exceed the ext3 maximum of 32000 subdirectories.
Workaround: Either decrease the value of the Command Eviction Age property so that the directories are more aggressively cleaned up, or migrate to the ext4 filesystem.
Backup and disaster recovery replication does not set MapReduce Java options
Replication used for backup and disaster recovery relies on system-wide MapReduce memory options, and you cannot configure the options using the Advanced Configuration Snippet.
Kafka 1.2 CSD conflicts with CSD included in Cloudera Manager 5.4
If the Kafka CSD was installed in Cloudera Manager 5.3 or lower, the old version must be uninstalled; otherwise, it conflicts with the version of the Kafka CSD bundled with Cloudera Manager 5.4.
- Determine the location of the CSD directory:
- Go to the Cloudera Manager settings page.
- Click the Custom Service Descriptors category.
- Retrieve the directory from the Local Descriptor Repository Path property.
- Delete the Kafka CSD from the directory.
Recommission host doesn't deploy client configurations
The failure to deploy client configurations can result in client configuration pointing to the wrong locations, which can cause errors such as the NodeManager failing to start with "Failed to initialize container executor".
Workaround: Deploy client configurations first and then restart roles on the recommissioned host.
Hive on Spark is not supported in Cloudera Manager and CDH 5.4 and CDH 5.5
You can configure Hive on Spark, but it is not recommended for production clusters.
CDH 5 requires JDK 1.7
JDK 1.6 is not supported on any CDH 5 release, but before CDH 5.4.0, CDH libraries were compatible with JDK 1.6. As of CDH 5.4.0, CDH libraries are no longer compatible with JDK 1.6, and applications using CDH libraries must use JDK 1.7.
Upgrade wizard incorrectly upgrades the Sentry DB
There's no Sentry DB upgrade in 5.4, but the upgrade wizard says there is. Performing the upgrade command is not harmful, and taking the backup is also not harmful, but the steps are unnecessary.
Cloudera Manager doesn't correctly generate client configurations for services deployed using CSDs
HiveServer2 requires a Spark on YARN gateway on the same host in order for Hive on Spark to work. You must deploy Spark client configurations whenever there's a change in order for HiveServer2 to pick up the change.
CSDs that depend on Spark will get incomplete Spark client configuration. Note that Cloudera Manager does not ship with any such CSDs by default.
Workaround: Use /etc/spark/conf for Spark configuration, and ensure there is a Spark on YARN gateway on that host.
Solr, Oozie and HttpFS fail when KMS and TLS/SSL are enabled using self-signed certificates
When KMS and TLS/SSL are enabled using self-signed certificates, these services fail with errors such as the following Oozie exception:
org.apache.oozie.service.AuthorizationException: E0501: Could not perform authorization operation, sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
Workaround: You must explicitly load the relevant truststores with the KMS certificate to allow these services to communicate with the KMS. To do so, edit the truststore location and password for Solr, Oozie and HttpFS (found under the HDFS service) as follows.
- Go to the Cloudera Manager Admin Console.
- Navigate to the Solr/Oozie/HDFS service.
- Click the Configuration tab.
- Search for "<service> TLS/SSL Certificate Trust Store File" and set this property to the location of the truststore file.
- Search for "<service> TLS/SSL Certificate Trust Store Password" and set this property to the password of the truststore.
- Click Save Changes to commit the changes.
Cloudera Manager 5.3.1 upgrade fails if Spark standalone and Kerberos are configured
CDH upgrade fails if Kerberos is enabled and Spark standalone is installed. Spark standalone doesn't work in a kerberized cluster.
Workaround: To upgrade, remove the Spark standalone service first and then proceed with the upgrade.
KMS and Key Trustee ACLs do not work in Cloudera Manager 5.3
ACLs configured for the KMS (File) and KMS (Navigator Key Trustee) services do not work since these services do not receive the values for hadoop.security.group.mapping and related group mapping configuration properties.
Workaround: Add the group mapping properties to the affected service:
KMS (File): Add all configuration properties starting with hadoop.security.group.mapping from the NameNode core-site.xml to the KMS (File) property, Key Management Server Advanced Configuration Snippet (Safety Valve) for core-site.xml.
KMS (Navigator Key Trustee): Add all configuration properties starting with hadoop.security.group.mapping from the NameNode core-site.xml to the KMS (Navigator Key Trustee) property, Key Management Server Proxy Advanced Configuration Snippet (Safety Valve) for core-site.xml.
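Collecting the relevant properties by hand is error-prone, so the selection can be sketched in code. The following is a minimal, hypothetical helper (not a Cloudera tool) that reads the text of a core-site.xml and returns only the properties whose names start with hadoop.security.group.mapping, formatted for pasting into a safety valve. The sample values, including the LDAP URL, are placeholders.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

PREFIX = "hadoop.security.group.mapping"

# Placeholder input that stands in for the NameNode core-site.xml.
SAMPLE_CORE_SITE = """<configuration>
  <property><name>hadoop.security.group.mapping</name><value>org.apache.hadoop.security.LdapGroupsMapping</value></property>
  <property><name>hadoop.security.group.mapping.ldap.url</name><value>ldap://ldap.example.com</value></property>
  <property><name>fs.defaultFS</name><value>hdfs://nameservice1</value></property>
</configuration>"""


def group_mapping_snippet(core_site_xml, prefix=PREFIX):
    """Return the matching <property> elements as safety-valve XML text."""
    root = ET.fromstring(core_site_xml)
    lines = []
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        if name.startswith(prefix):
            value = prop.findtext("value") or ""
            lines.append(
                f"<property><name>{escape(name)}</name>"
                f"<value>{escape(value)}</value></property>"
            )
    return "\n".join(lines)


if __name__ == "__main__":
    print(group_mapping_snippet(SAMPLE_CORE_SITE))
```

The output contains the two group mapping properties and omits fs.defaultFS, which does not match the prefix.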
Exporting and importing Hue database sometimes times out after 90 seconds
Executing 'dump database' or 'load database' of Hue from Cloudera Manager returns "command aborted because of exception: Command timed-out after 90 seconds". The Hue database can be exported to JSON from within Cloudera Manager. Unfortunately, sometimes the Hue database is quite large and the export times out after 90 seconds.
Workaround: Ignore the timeout. The command should eventually succeed even though Cloudera Manager reports that it timed out.
Changing hostname of key trustee server requires editing the keytrustee.conf file
If you change the hostname of your primary or backup server, you will need to edit your keytrustee.conf file. This issue typically arises if you replace a primary or backup server with a server having a different hostname. If the same hostname is used on the new server, there will be no issues.
Workaround: Use the same hostname on the replacement server.
Hosts with Impala Llama roles must also have at least one YARN role
If a host with an Impala Llama role has no YARN role, errors such as "Exception running /etc/hadoop/conf.cloudera.yarn/topology.py java.io.IOException: Cannot run program "/etc/hadoop/conf.cloudera.yarn/topology.py"" appear in the Llama role logs, and Impala queries may fail.
Workaround: Add a YARN gateway role to each Llama host that does not already have at least one YARN role (of any type).
The high availability wizard does not verify that there is a running ZooKeeper service
The wizard does not detect either of the following conditions:
- ZooKeeper is present but not running, and the HDFS dependency on ZooKeeper is not set
- ZooKeeper is absent
Workaround:
- Create and start a ZooKeeper service if one doesn't exist.
- Go to the HDFS service.
- Click the Configuration tab.
- Set the ZooKeeper Service property to the ZooKeeper service.
- Click Save Changes to commit the changes.
Cloudera Manager Installation Path A fails on RHEL 5.7 due to PostgreSQL conflict
On RHEL 5.7, cloudera-manager-installer.bin fails due to a PostgreSQL conflict if PostgreSQL 8.1 is already installed on your host.
Workaround: Remove PostgreSQL from the host and rerun cloudera-manager-installer.bin.
Spurious warning on Accumulo 1.6 gateway hosts
When using the Accumulo shell on a host with only an Accumulo 1.6 Service gateway role, users will receive a warning about failing to create the directory /var/log/accumulo. The shell works normally otherwise.
Workaround: The warning is safe to ignore.
Accumulo 1.6 service log aggregation and search does not work
Cloudera Manager log aggregation and search features are incompatible with the log formatting needed by the Accumulo Monitor. Attempting to use either the "Log Search" diagnostics feature or the log file link off of an individual service role's summary page will result in empty search results.
Workaround: Operators can use the Accumulo Monitor to see recent severe log messages. They can see recent log messages below the WARNING level via a given role's process page and can inspect full logs on individual hosts by looking in /var/log/accumulo.
Cloudera Manager incorrectly sizes Accumulo Tablet Server max heap size after 1.4.4-cdh4.5.0 to 1.6.0-cdh4.6.0 upgrade
Because the upgrade path from Accumulo 1.4.4-cdh4.5.0 to 1.6.0-cdh4.6.0 involves having both services installed simultaneously, Cloudera Manager assumes that worker hosts in the cluster are oversubscribed on memory and attempts to downsize the max heap size allowed for 1.6.0-cdh4.6.0 Tablet Servers.
Workaround: Manually verify that the Accumulo 1.6.0-cdh4.6.0 Tablet Server max heap size is large enough for your needs. Cloudera recommends you set this value to the sum of 1.4.4-cdh4.5.0 Tablet Server and Logger heap sizes.
Accumulo installations using LZO do not indicate dependence on the GPL Extras parcel
Accumulo 1.6 installations that use LZO compression functionality do not indicate that LZO depends on the GPL Extras parcel. When Accumulo is configured to use LZO, Cloudera Manager has no way to track that the Accumulo service now relies on the GPL Extras parcel. This prevents Cloudera Manager from warning administrators before they remove the parcel while Accumulo still requires it for proper operation.
Workaround: Check your Accumulo 1.6 service for the configuration changes mentioned in the Cloudera documentation for using Accumulo with CDH prior to removing the GPL Extras parcel. If the parcel is mistakenly removed, reinstall it and restart the Accumulo 1.6 service.
Created pools are not preserved when Dynamic Resource Pools page is used to configure YARN or Impala
Pools created on demand are not preserved when changes are made using the Dynamic Resource Pools page. If the Dynamic Resource Pools page is used to configure YARN and/or Impala services in a cluster, it is possible to specify pool placement rules that create a pool if one does not already exist. If changes are made to the configuration using this page, pools created as a result of such rules are not preserved across the configuration change.
Workaround: Submit the YARN application or Impala query as before, and the pool will be created on demand once again.
User should be prompted to add the AMON role when adding MapReduce to a CDH 5 cluster
When the MapReduce service is added to a CDH 5 cluster, the user is not asked to add the AMON role. Then, an error displays when the user tries to view MapReduce activities.
Workaround: Manually add the AMON role after adding the MapReduce service.
Enterprise license expiration alert not displayed until Cloudera Manager Server is restarted
When an enterprise license expires, the expiration notification banner is not displayed until the Cloudera Manager Server has been restarted. The enterprise features of Cloudera Manager are not affected by an expired license.
Configurations for decommissioned roles not migrated from MapReduce to YARN
When the Import MapReduce Configuration wizard is used to import MapReduce configurations to YARN, decommissioned roles in the MapReduce service do not cause the corresponding imported roles to be marked as decommissioned in YARN.
Workaround: Delete or decommission the roles in YARN after running the import.
The HDFS command Roll Edits does not work in the UI when HDFS is federated
The HDFS command Roll Edits does not work in the Cloudera Manager UI when HDFS is federated because the command doesn't know which nameservice to use.
Workaround: Use the API, not the Cloudera Manager UI, to execute the Roll Edits command.
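As an illustration of the API route, the sketch below only builds the HTTP request and does not send it. The endpoint path (commands/hdfsRollEdits), the API version, and the "nameservice" argument are assumptions based on the Cloudera Manager REST API and should be verified against the API documentation for your release. The host, cluster, and service names are placeholders, and authentication is omitted.

```python
import json
import urllib.request
from urllib.parse import quote


def build_roll_edits_request(base_url, cluster, service, nameservice,
                             api_version="v3"):
    """Build (but do not send) a POST request for the Roll Edits command.

    Assumed endpoint: /api/<version>/clusters/<cluster>/services/<service>/
    commands/hdfsRollEdits, with a JSON body naming the nameservice.
    """
    url = (
        f"{base_url.rstrip('/')}/api/{api_version}"
        f"/clusters/{quote(cluster, safe='')}"
        f"/services/{quote(service, safe='')}"
        "/commands/hdfsRollEdits"
    )
    body = json.dumps({"nameservice": nameservice}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


if __name__ == "__main__":
    req = build_roll_edits_request(
        "http://cm.example.com:7180", "Cluster 1", "hdfs", "nameservice1"
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Naming the nameservice explicitly in the request is what the UI command cannot do when HDFS is federated.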
Cloudera Manager reports a confusing version number if you have oozie-client, but not oozie installed on a CDH 4.4 node
In CDH versions before 4.4, the metadata identifying Oozie was placed in the client package rather than the server package. Consequently, if the server package is installed but the client package is not, Cloudera Manager reports Oozie as present but as coming from CDH 3 instead of CDH 4.
Workaround: Either install the oozie-client package, or upgrade to at least CDH 4.4. Parcel-based installations are unaffected.
Cloudera Manager doesn't work with CDH 5.0.0 Beta 1
When you upgrade from Cloudera Manager 5.0.0 Beta 1 with CDH 5.0.0 Beta 1 to Cloudera Manager 5.0.0 Beta 2, Cloudera Manager won't work with CDH 5.0.0 Beta 1 and there's no notification of that fact.
Workaround: None. Do a new installation of CDH 5.0.0 Beta 2.
On CDH 4.1 secure clusters managed by Cloudera Manager 4.8.1 and higher, the Impala Catalog server needs advanced configuration snippet update
Impala queries fail on CDH 4.1 when Hive "Bypass Hive Metastore Server" option is selected.
Workaround: Add the following to Impala catalog server advanced configuration snippet for hive-site.xml, replacing Hive_Metastore_Server_Host with the host name of your Hive Metastore Server:
<property>
  <name>hive.metastore.local</name>
  <value>false</value>
</property>
<property>
  <name>hive.metastore.uris</name>
  <value>thrift://Hive_Metastore_Server_Host:9083</value>
</property>
Rolling Upgrade to CDH 5 is not supported.
Rolling upgrade between CDH 4 and CDH 5 is not supported. Incompatibilities between major versions mean rolling restarts are not possible. In addition, rolling upgrade will not be supported from CDH 5.0.0 Beta 1 to any later releases, and may not be supported between any future beta versions of CDH 5 and the General Availability release of CDH 5.
Error reading .zip file created with the Collect Diagnostic Data command.
After you collect diagnostic data and use the Download Diagnostic Data button to download the resulting zip file to the local system, the zip file cannot be opened using the Firefox browser on a Macintosh. This is because the zip file is created as a Zip64 file, and the unzip utility included with Macs does not support Zip64. The unzip utility must be version 6.0 or later. You can determine the version with unzip -v.
Workaround: Update the unzip utility to a version that supports Zip64.
After JobTracker failover, complete jobs from the previous active JobTracker are not visible.
When a JobTracker failover occurs and a new JobTracker becomes active, the new JobTracker UI does not show the completed jobs from the previously active JobTracker (that is now the standby JobTracker). For these jobs the "Job Details" link does not work.
After JobTracker failover, information about rerun jobs is not updated in Activity Monitor.
When a JobTracker failover occurs while there are running jobs, jobs are restarted by the new active JobTracker by default. For the restarted jobs the Activity Monitor will not update the following: 1) The start time of the restarted job will remain the start time of the original job. 2) Any Map or Reduce task that had finished before the failure happened will not be updated with information about the corresponding task that was rerun by the new active JobTracker.
Installing on AWS, you must use private EC2 hostnames.
When installing on an AWS instance, and adding hosts using their public names, the installation will fail when the hosts fail to heartbeat.
Workaround:
- Use the Back button in the wizard to return to the original screen, where it prompts for a license.
- Rerun the wizard, but choose "Use existing hosts" instead of searching for hosts. The hosts then show up with their internal EC2 names.
- Continue through the wizard; the installation should succeed.
If HDFS uses Quorum-based Storage without HA enabled, the SecondaryNameNode cannot checkpoint.
If HDFS is set up in non-HA mode, but with Quorum-based storage configured, the dfs.namenode.edits.dir is automatically configured to the Quorum-based Storage URI. However, the SecondaryNameNode cannot currently read the edits from a Quorum-based Storage URI, and will be unable to do a checkpoint.
Workaround: Add to the NameNode's advanced configuration snippet the dfs.namenode.edits.dir property with both the value of the Quorum-based Storage URI as well as a local directory, and restart the NameNode. For example:
<property>
  <name>dfs.namenode.edits.dir</name>
  <value>qjournal://jn1HostName:8485;jn2HostName:8485;jn3HostName:8485/journalhdfs1,file:///dfs/edits</value>
</property>
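The combined property value follows a regular pattern, so it can be assembled programmatically. The sketch below is a hypothetical helper that joins the JournalNode hosts into a Quorum-based Storage URI and appends a local directory; the host names, journal ID, port, and directory are the placeholders used in the example above.

```python
def edits_dir_value(journal_nodes, journal_id, local_dir, port=8485):
    """Build a dfs.namenode.edits.dir value from a quorum URI and a local dir.

    Hypothetical helper: journal_nodes is a list of JournalNode host names,
    and local_dir must be an absolute path on the NameNode host.
    """
    if not journal_nodes:
        raise ValueError("at least one JournalNode host is required")
    if not local_dir.startswith("/"):
        raise ValueError("local_dir must be an absolute path")
    hosts = ";".join(f"{host}:{port}" for host in journal_nodes)
    return f"qjournal://{hosts}/{journal_id},file://{local_dir}"


if __name__ == "__main__":
    print(edits_dir_value(
        ["jn1HostName", "jn2HostName", "jn3HostName"],
        "journalhdfs1",
        "/dfs/edits",
    ))
```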
Changing the rack configuration may temporarily cause mis-replicated blocks to be reported.
A rack re-configuration will cause HDFS to report mis-replicated blocks until HDFS rebalances the system, which may take some time. This is a normal side-effect of changing the configuration.
Cannot use '/' as a mount point with a Federated HDFS Nameservice.
A Federated HDFS Service doesn't support nested mount points, so it is impossible to mount anything at '/'. Because of this issue, the root directory will always be read-only, and any client application that requires a writeable root directory will fail.
Workaround:
- In the CDH 4 HDFS Service > Configuration tab of the Cloudera Manager Admin Console, search for "nameservice".
- In the Mountpoints field, change the mount point from "/" to a list of mount points that are in the namespace that the Nameservice will manage. (You can enter this as a comma-separated list - for example, "/hbase, /tmp, /user" or by clicking the plus icon to add each mount point in its own field.) You can determine the list of mount points by running the command hadoop fs -ls / from the CLI on the NameNode host.
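Turning the directory listing into the comma-separated list can be sketched as follows. This hypothetical helper parses text captured from hadoop fs -ls / and assumes the usual eight-column listing layout (permissions, replication, owner, group, size, date, time, path) and paths without spaces. The sample listing is illustrative only.

```python
SAMPLE_LS_OUTPUT = """Found 3 items
drwxr-xr-x   - hbase hbase               0 2014-01-01 10:00 /hbase
drwxrwxrwt   - hdfs  supergroup          0 2014-01-01 10:00 /tmp
drwxr-xr-x   - hdfs  supergroup          0 2014-01-01 10:00 /user
"""


def mount_points_from_ls(ls_output):
    """Return top-level directories from the listing as a comma-separated list."""
    paths = []
    for line in ls_output.splitlines():
        fields = line.split()
        # Directory entries have a permission string starting with "d";
        # the path is the last field. The "Found N items" header is skipped.
        if len(fields) >= 8 and fields[0].startswith("d"):
            paths.append(fields[-1])
    return ", ".join(paths)


if __name__ == "__main__":
    print(mount_points_from_ls(SAMPLE_LS_OUTPUT))
```

The resulting string has the same form as the "/hbase, /tmp, /user" example and can be pasted into the Mountpoints field.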
Historical disk usage reports do not work with federated HDFS.
(CDH 4 only) Activity monitoring does not work on YARN activities.
Activity monitoring is not supported for YARN in CDH 4.
HDFS monitoring configuration applies to all Nameservices
The monitoring configurations at the HDFS level apply to all Nameservices. So, if there are two federated Nameservices, it's not possible to disable a check on one but not the other. Likewise, it's not possible to have different thresholds for the two Nameservices.
Supported and Unsupported Replication Scenarios and Limitations
See Data Replication.
Restoring snapshot of a file to an empty directory does not overwrite the directory
Restoring the snapshot of an HDFS file to an HDFS path that is an empty HDFS directory (using the Restore As action) places the restored file inside the HDFS directory instead of overwriting the empty HDFS directory.
HDFS Snapshot appears to fail if policy specifies duplicate directories.
In an HDFS snapshot policy, if a directory is specified more than once, the snapshot appears to fail with an error message on the Snapshot page. However, in the HDFS Browser, the snapshot is shown as having been created successfully.
Workaround: Remove the duplicate directory specification from the policy.
Hive replication fails if "Force Overwrite" is not set.
The Force Overwrite option, if checked, forces overwriting data in the target metastore if there are incompatible changes detected. For example, if the target metastore was modified and a new partition was added to a table, this option would force deletion of that partition, overwriting the table with the version found on the source. If the Force Overwrite option is not set, recurring replications may fail.
Workaround: Set the Force Overwrite option.