Known Issues and Workarounds in Cloudera Manager 4

The following sections describe the current known issues and fixed issues in each Cloudera Manager release.

Known Issues in the Current Release

— Beeswax does not respect the Java Heap Size.

As of Cloudera Manager 4.5.1, the Beeswax maximum heap size setting is overridden by the value set in hadoop-env.sh; the Beeswax-specific setting is ignored.

Severity: Medium

Anticipated Resolution: To be fixed in a future release.

Workaround: In the Hue service, configure the Hue Service Environment Safety Valve (under the Configuration tab > View and Edit > Service-Wide > Advanced) with HADOOP_CLIENT_OPTS=-Xmx1024m (or whatever heap size you require).
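The safety valve accepts environment-variable assignments, one per line. A minimal sketch, using the 1024 MB heap from the example above (adjust the value to your needs):

HADOOP_CLIENT_OPTS=-Xmx1024m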

— Enabling wildcarding in a secure environment causes NameNode to fail to start.

In a secure cluster, you cannot use a wildcard for the NameNode's RPC or HTTP bind address, or the NameNode will fail to start. For example, dfs.namenode.http-address must be a real, routable address and port, not 0.0.0.0:<port>. In Cloudera Manager, the "Bind NameNode to Wildcard Address" property must not be enabled. This should affect you only if you are running a secure cluster and your NameNode needs to bind to multiple local addresses.

Bug: HDFS-4448

Severity: Medium

Anticipated Resolution: To be fixed in a future CDH release.

Workaround: Disable the "Bind NameNode to Wildcard Address" property found under the Configuration tab for the NameNode role group.
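For reference, a non-wildcard HTTP address in hdfs-site.xml looks roughly like the following; the hostname is a placeholder, and 50070 is the usual NameNode web UI port:

<property>
  <name>dfs.namenode.http-address</name>
  <value>nameNodeHostName:50070</value>
</property>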

— Upgrade from 4.6.0 to 4.6.1 with HA enabled may cause HDFS restarts/failovers to fail.

When upgrading from an installation of Cloudera Manager 4.6.0 to 4.6.1 with HDFS High Availability enabled, you must set the value of the NameNode Service RPC port (dfs.namenode.servicerpc-address) to 8022, or else HDFS failover or restart will fail. Restart the HDFS service after you have changed the property value.

  Note: This workaround applies only if you installed Cloudera Manager 4.6.0 as a new installation. If you upgraded to Cloudera Manager 4.6.0 from Cloudera Manager 4.5, you should not need to do this.

Severity: High

Anticipated Resolution: None.

Workaround: To set the value of the NameNode Service RPC port:
  1. Go to the HDFS service, Configuration tab, View and Edit.
  2. Type servicerpc in the search field to find the property.
  3. Change its value to 8022 and Save Changes.
  4. Restart the HDFS service.
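The resulting hdfs-site.xml entry for the NameNode would then look roughly like this (the hostname is a placeholder):

<property>
  <name>dfs.namenode.servicerpc-address</name>
  <value>nameNodeHostName:8022</value>
</property>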

— Errors when using Hive Server2 with CDH4.1.

Cloudera Manager 4.5 or later supports Hive Server2 with CDH4.2 only. While it is possible to add the Hive Server2 role in Cloudera Manager when using CDH4.1, you may experience errors such as missing log files or other problems.

Severity: Medium

Anticipated Resolution: Fixed in CDH4.2. No plans to fix for CDH4.1.

Workaround: Upgrade to CDH4.2, or run Hive Server2 outside of Cloudera Manager.

— Federation setup workflow may result in failure of NameNode format step.

When you use the "Add Nameservice" workflow to add a Nameservice (without choosing the "Enable NFS High Availability" option) to an HDFS service that already has a Nameservice configured to use JournalNodes, the NameNode format step fails because the new Nameservice is incorrectly configured to use the same journal name as the existing Nameservice.

Severity: High

Anticipated Resolution: To be fixed in an upcoming release.

Workaround: Configure the new NameNodes (via the safety valve) with a QuorumJournal URL that has a different journal name from the original Nameservice, and then manually perform the rest of the steps in the "Add Nameservice" workflow.
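For example, if the existing Nameservice uses the journal name journalhdfs1, the new NameNodes could be pointed at a distinct journal name along these lines. This sketch assumes an HA setup, where dfs.namenode.shared.edits.dir carries the QuorumJournal URI (non-HA setups carry it in dfs.namenode.edits.dir instead, as in the checkpoint issue later in this section); the host names and the journal name journalhdfs2 are placeholders:

<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://jn1HostName:8485;jn2HostName:8485;jn3HostName:8485/journalhdfs2</value>
</property>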

— Impala log file is not rolling over per the max log size setting.

Impala logging uses two loggers, GLog and log4j, which both write to a single log file. GLog correctly rolls its logging to a new file per the Impala Daemon Max Log Size property, but log4j ignores that setting and continues to log into the original log file.

Severity: Medium

Anticipated Resolution: To be fixed in an upcoming release.

Workaround: None.

— After JobTracker failover, complete jobs from the previous active JobTracker are not visible.

When a JobTracker failover occurs and a new JobTracker becomes active, the new JobTracker's UI does not show the completed jobs from the previously active JobTracker (now the standby JobTracker). For these jobs, the "Job Details" link does not work.

Severity: Medium

Anticipated Resolution: To be fixed in an upcoming release.

Workaround: None.

— After JobTracker failover, information about rerun jobs is not updated in Activity Monitor.

When a JobTracker failover occurs while jobs are running, the new active JobTracker restarts those jobs by default. For the restarted jobs, the Activity Monitor will not update the following:
  1. The start time of the restarted job will remain the start time of the original job.
  2. Any Map or Reduce task that finished before the failure will not be updated with information about the corresponding task rerun by the new active JobTracker.

Severity: Medium

Anticipated Resolution: None.

Workaround: None.

— Impala Query Monitor shows queries as running even when they have finished.

Due to a problem in Hue, queries issued from the Impala query application in Hue will appear as running in Cloudera Manager's Impala Query Monitor and as Active in the Impala Daemon web UI even after they have finished and have been marked "expired" in Hue.

Severity: Medium

Anticipated Resolution: To be fixed in an upcoming release.

Workaround: None.

— (BDR) Cannot add a Peer cluster that is running Cloudera Manager Free Edition.

Replication is not supported with Cloudera Manager Free Edition (or Cloudera Standard), so attempting to add, as a peer, a cluster managed by a Free Edition Cloudera Manager Server will fail. As of Cloudera Manager 4.6, the Add Peer function will succeed, but this is still not a supported configuration.

Severity: Medium

Anticipated Resolution: None.

Workaround: None.

— Secure bulk loading in HBase fails after upgrade if no coprocessors are configured.

To upgrade HBase to CDH4.3 in a secure cluster, the org.apache.hadoop.hbase.security.access.SecureBulkLoadEndpoint entry is required in the HBase Coprocessor Region Classes configuration property. By default, Cloudera Manager leaves this property empty. If you do not configure it, secure bulk loading jobs will fail after the upgrade to CDH4.3.

Severity: Medium

Anticipated Resolution: To be fixed in an upcoming release.

Workaround: Add org.apache.hadoop.hbase.security.access.SecureBulkLoadEndpoint to the HBase Coprocessor Region Classes property in every RegionServer Role Group. Then re-deploy the client configuration before upgrading.
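The resulting hbase-site.xml entry would look something like this; hbase.coprocessor.region.classes is the property that normally backs the "HBase Coprocessor Region Classes" setting:

<property>
  <name>hbase.coprocessor.region.classes</name>
  <value>org.apache.hadoop.hbase.security.access.SecureBulkLoadEndpoint</value>
</property>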

— WebHCat role logs cannot be written in CDH4.2.

When using WebHCat with the default configuration on CDH4.2, role logs cannot be written because of a permission error on /var/log/hcatalog, which is owned by user hcatalog rather than user hive.

Severity: Low

Resolution: Fixed in CDH4.3, where /var/log/hcatalog is owned by the hive user by default.

Workaround: chown /var/log/hcatalog to the process user used by the Hive service, which is hive by default. Alternatively, change the WebHCat log directory.
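Assuming the default hive process user (and a matching hive group), the command would be roughly:

# chown -R hive:hive /var/log/hcatalog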

— Impala Queries fail when "Bypass Hive Metastore Server" option is selected.

Impala queries fail on CDH4.1 when the Hive "Bypass Hive Metastore Server" option is selected. You can work around this by using the Impala Safety Valve for hive-site.xml, replacing <hive_metastore_server_host> with the name of your Hive Metastore Server host.

Severity: Medium

Anticipated Resolution: Fixed in CDH4.2. No plans to fix for CDH4.1.

Workaround: See the detailed instructions for the safety valve configuration in Installing Impala with Cloudera Manager.
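As a minimal sketch, the safety valve snippet typically sets hive.metastore.uris, the standard property that points clients at a Hive Metastore Server; 9083 is the default metastore port:

<property>
  <name>hive.metastore.uris</name>
  <value>thrift://<hive_metastore_server_host>:9083</value>
</property>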

— Hive Table Stats configuration recommended for optimal performance.

Configuring Hive Table Stats is highly recommended when using Impala. Table stats allow Impala to make optimizations that can result in significant (over 10x) performance improvements for some joins. If table stats are not available, Impala will still function, but at lower performance.

Severity: Medium

Anticipated Resolution: To be fixed in an upcoming release.

Workaround: See Installing Impala with Cloudera Manager in the Cloudera Manager Installation Guide for information on configuring Hive Table Stats.

— Health Check for Navigator and Reports appears in the API results even if those roles are not configured.

The Cloudera Manager Navigator health check appears as "Not Available" in the Cloudera Manager API health results for the MGMT service, even if no Navigator role is configured. The same is true of the Reports Manager role. This can occur if you are running the Cloudera Standard version of Cloudera Manager. This can be safely ignored and may be removed in a future release.

Severity: Medium

Anticipated Resolution: To be fixed in an upcoming release.

Workaround: None.

— During HDFS replication, tasks may fail due to DataNode timeouts.

In CDH4.2, during an HDFS replication job (using Cloudera Manager's Backup and Data Recovery product) individual tasks in the Replication job may fail due to DataNode timeouts. If enough of these timeouts occur, the replication task may slow down, and the entire replication job could time out and fail.

Severity: Medium

Anticipated Resolution: To be fixed in an upcoming CDH release.

Workaround: None.

— Upgrading a secure CDH3 cluster to CDH4 fails due to missing HTTP principal in NameNode's keytab.

If you have set up a secure CDH3 cluster using a Cloudera Manager version before 4.5, upgrading the cluster to CDH4 will fail because the NameNode's hdfs.keytab file does not contain the HTTP principal that is required in CDH4 HDFS.

If you use a custom keytab-generation script with Cloudera Manager, modify the script to include the HTTP principal for CDH3 NameNodes so that an upgrade to CDH4 is possible.

Severity: High if you used a pre-4.5 CM to set up a secure CDH3 cluster and want to upgrade it to CDH4. Otherwise N/A.

Workaround:

  1. Upgrade to Cloudera Manager 4.5 or later.
  2. From the Administration menu, select Kerberos.
  3. Select the NameNode's credentials and press the Regenerate button. This will cause the HTTP principal to be included in the NameNode's hdfs.keytab.

Note that if you set up a secure CDH3 cluster using Cloudera Manager 4.5, this workaround is not necessary and the bug does not manifest.

— Java 6 GC bug leads to a memory leak.

Java 6 has a bug with finalizers that leads to a memory leak when -XX:+ConcMarkSweepGC is used. The bug is fixed in Java6u32; see http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7112034. To work around it, Cloudera Manager configures processes with both -XX:+ConcMarkSweepGC and -XX:-CMSConcurrentMTEnabled. This workaround carries a slight performance penalty.

Severity: Medium

Anticipated Resolution: None in Cloudera Manager; fixed in Java6u32.

Workaround: As described above. If you have a JVM that does not exhibit this bug, you can remove -XX:-CMSConcurrentMTEnabled by configuring the JVM arguments for your services.
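For example, after removing the flag, a role's Java arguments might read as follows; the heap size shown is only a placeholder:

-Xmx1024m -XX:+ConcMarkSweepGC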

— After upgrade from 4.5 beta to 4.5 GA, Hive service lacks a mandatory dependency on a MapReduce service.

After upgrading from 4.5 beta to 4.5 GA, the Hive service lacks a mandatory dependency on a MapReduce service.

Severity: Low

Anticipated Resolution: None.

Workaround: Navigate to the Hive service's configuration page (dismissing any error popups along the way) and set the "MapReduce Service" configuration.

— After upgrade to Cloudera Manager 4.5, the new Hive Metastore Server fails to start if Hue/Beeswax uses a Derby metastore.

When you upgrade to Cloudera Manager 4.5, the upgrade creates a new Hive service to capture the Hive dependency of an existing Hue service. If Hue/Beeswax uses a Derby metastore, Hue will keep working, but the new Hive Metastore Server will fail to start because a Derby metastore cannot be shared between multiple services. This is harmless, but you should consider migrating away from a Derby metastore.

Severity: Low

Anticipated Resolution: None. This is the intended behavior.

Workaround: None.

— On a Cloudera Manager managed cluster, the NameNode doesn't listen on loopback by default.

By default, wildcards are disabled. To have a role (for example, NameNode) listen on loopback, enable wildcards for that role.

Severity: Low

Anticipated Resolution: None.

Workaround: To have a role (for example, the NameNode) listen on loopback, enable wildcards for that role. This can be done from the Configuration tab for all HDFS roles and for the JobTracker.

— If HDFS NameNode is configured to bind to a wildcard address, some Hue applications won't work.

If the HDFS NameNode is configured to bind to a wildcard address using the "Bind NameNode to Wildcard Address" property, certain Hue applications will not work: the Oozie, JobDesigner, FileBrowser, and Pig Shell applications fail.

Severity: Low

Anticipated Resolution: To be fixed in a future CDH release.

Workaround: Disable the "Bind NameNode to Wildcard Address" property.

— When installing on AWS, you must use private EC2 hostnames.

When installing on AWS instances and adding hosts using their public names, the installation will fail when the hosts fail to heartbeat.

Severity: Medium

Anticipated Resolution: To be fixed in a future release.

Workaround:

  1. Use the Back button in the wizard to return to the original screen, where it prompts for a license.
  2. Rerun the wizard, but choose "Use existing hosts" instead of searching for hosts. Now those hosts show up with their internal EC2 names.
  3. Continue through the wizard, and the installation should succeed.

— After removing and then re-adding a service, the alternatives settings are incorrect.

After deleting a Cloudera Manager service, the alternatives settings are not cleaned up. If you then re-add the service, it is given a new instance name, and a new set of configuration settings is added. However, because both the new and old (deleted) instances have the same alternatives priority, the original one will be used rather than the newer one.

Severity: Medium

Anticipated Resolution: To be fixed in a future release.

Workaround: The simplest way to fix this is:

  1. Go to the Configuration tab for the new service instance in Cloudera Manager.
  2. Search for "alternatives".
  3. Raise the priority value and Save your setting.
  4. Redeploy your client configuration (from the Actions menu).
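To confirm which instance currently wins, you can inspect the alternatives priorities from the shell; hadoop-conf is the alternatives name typically used for client configuration:

# update-alternatives --display hadoop-conf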

— New schema extensions have been introduced for Oozie in CDH4.1.

In CDH4.1, Oozie introduced new versions for Hive, Sqoop and workflow schemas. To use them, you must add the new schema extensions to the Oozie SchemaService Workflow Extension Schemas configuration property in Cloudera Manager.

Severity: Low

Anticipated Resolution: To be fixed in a future release.

Workaround: In Cloudera Manager, do the following:

  1. Go to the CDH4 Oozie service page.
  2. Go to the Configuration tab.
  3. Select the Oozie Server category.
  4. Add the following to the Oozie SchemaService Workflow Extension Schemas property:
    shell-action-0.2.xsd hive-action-0.3.xsd sqoop-action-0.3.xsd
  5. Save these changes.
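For reference, the corresponding oozie-site.xml entry would look roughly like this; oozie.service.SchemaService.wf.ext.schemas is Oozie's standard name for this property, though the separator expected by the Cloudera Manager field may differ:

<property>
  <name>oozie.service.SchemaService.wf.ext.schemas</name>
  <value>shell-action-0.2.xsd,hive-action-0.3.xsd,sqoop-action-0.3.xsd</value>
</property>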

— Stop dependent HBase services before enabling HDFS Automatic Failover.

When enabling HDFS Automatic Failover, you need to first stop any dependent HBase services. The Automatic Failover configuration workflow restarts both NameNodes, which could cause HBase to become unavailable.

Severity: Medium

Anticipated Resolution: To be fixed in a future release.

Workaround: Stop any dependent HBase services before enabling HDFS Automatic Failover, and restart them after the workflow completes.

— On Ubuntu 10.04, the Cloudera Manager agent will not run with an upgraded system python.

On Ubuntu 10.04, the Cloudera Manager agent will not run if the system python is upgraded to 2.6.5-1ubuntu6.1. (2.6.5-1ubuntu6 works correctly.) If you have upgraded, you must also rebuild your pre-prepared virtualenv.

Severity: Medium

Anticipated Resolution: None.

Workaround: Run the following commands:

# apt-get install python-virtualenv
# virtualenv /usr/lib64/cmf/agent/build/env

— Cloudera Manager does not support encrypted shuffle.

Encrypted shuffle was introduced in CDH4.1, but it is not currently possible to enable it through Cloudera Manager.

Severity: Medium

Anticipated Resolution: To be fixed in a future release.

Workaround: None.

— Enabling or disabling High Availability requires Hive Metastore modifications.

Enabling or disabling High Availability for the HDFS NameNode requires the Hive Metastore to be modified. This is necessary if the cluster contains services that depend on Hive, such as Impala and Hue. Before enabling or disabling HDFS High Availability, see the Known Issue "Tables created in Hive/Beeswax before HDFS is converted to HA become inaccessible after a failover" in the CDH4 Release Notes for details on modifying the Hive Metastore.

Severity: Medium

Anticipated Resolution: None.

Workaround: Run the "Update Hive Metastore NameNodes" command under the Hive service.

— Impala cannot be used with Federated HDFS.

If your cluster is configured to use Federated HDFS, Impala queries will fail.

Severity: Low

Anticipated Resolution: To be fixed in a future release.

Workaround: None.

— Links from the HBase Master Web UI to RegionServer Web UIs may be incorrect.

For the links from the HBase Master Web UI to the RegionServer Web UIs to be correct, all RegionServer Web UI ports must be the same. They can differ from the default value of 60030, but every RegionServer must use the same port number. In the RegionServer Web UI port configuration, the role type and role-level values should all be the same.

Severity: Low

Anticipated Resolution: None.

Workaround: Links from Cloudera Manager to the RegionServer Web UIs will be correct, and can be used to access the RegionServer Web UIs if the web ports cannot be the same.
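The setting maps to hbase.regionserver.info.port in hbase-site.xml. A consistent configuration would look something like this; 60030 shown here is the default, and any port works as long as every RegionServer uses the same one:

<property>
  <name>hbase.regionserver.info.port</name>
  <value>60030</value>
</property>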

— If HDFS uses Quorum-based Storage without HA enabled, the SecondaryNameNode cannot checkpoint.

If HDFS is set up in non-HA mode but with Quorum-based Storage configured, the dfs.namenode.edits.dir property is automatically set to the Quorum-based Storage URI. However, the SecondaryNameNode cannot currently read edits from a Quorum-based Storage URI and will be unable to perform a checkpoint.

Severity: Medium

Anticipated Resolution: To be fixed in a future release.

Workaround: In the NameNode's safety valve, add the dfs.namenode.edits.dir property with both the Quorum-based Storage URI and a local directory as its value, and then restart the NameNode. For example:

<property>
  <name>dfs.namenode.edits.dir</name>
  <value>qjournal://jn1HostName:8485;jn2HostName:8485;jn3HostName:8485/journalhdfs1,file:///dfs/edits</value>
</property>

— Changing the rack configuration may temporarily cause mis-replicated blocks to be reported.

A rack re-configuration will cause HDFS to report mis-replicated blocks until HDFS rebalances the system, which may take some time. This is a normal side-effect of changing the configuration.

Severity: Low

Anticipated Resolution: None.

Workaround: None.

— When starting HDFS with HA and Automatic Failover enabled, one of the NameNodes might not start.

When starting an HDFS service with High Availability and Automatic Failover enabled, one of the NameNodes might not start up.

Severity: Low

Anticipated Resolution: To be fixed in a future release.

Workaround: After the remaining HDFS roles have started up, start the NameNode that failed to start.

— Cannot use '/' as a mount point with a Federated HDFS Nameservice.

A Federated HDFS service doesn't support nested mount points, so it is impossible to mount anything at '/'. Because of this, the root directory will always be read-only, and any client application that requires a writable root directory will fail.

Severity: Low

Anticipated Resolution: To be fixed in a future release.

Workaround:
  1. In the CDH4 HDFS Service > Configuration tab of the Cloudera Manager Admin Console, search for "nameservice".
  2. In the Mountpoints field, change the mount point from "/" to a list of mount points that are in the namespace that the Nameservice will manage. (You can enter this as a comma-separated list, for example "/hbase, /tmp, /user", or click the plus icon to add each mount point in its own field.) You can determine the list of mount points by running the command hadoop fs -ls / from the CLI on the NameNode host.

— In the HDFS service, the default value for the Superuser Group setting has changed.

The default value for the Superuser Group setting (dfs.permissions.supergroup and dfs.permissions.superusergroup) has changed. In Cloudera Manager 3.7, the default value was hadoop. In Cloudera Manager 4.0, the default value is now superuser.

Severity: Low

Anticipated Resolution: None.

Workaround: If necessary, you can change the value for the Superuser Group by setting it in the HDFS service > Configuration tab of the Cloudera Manager Admin Console.
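For example, restoring the Cloudera Manager 3.7 default would produce an hdfs-site.xml entry like this:

<property>
  <name>dfs.permissions.superusergroup</name>
  <value>hadoop</value>
</property>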

— After upgrading to CM 4.1, roles may need to be restarted for Log Directory Monitoring to work.

After upgrading to Cloudera Manager 4.1, directory monitoring may show status "UNKNOWN" until roles are restarted. You can either restart the roles, or just ignore the unknown status until the next planned restart.

Severity: Low

Anticipated Resolution: None.

Workaround: None.

— Historical disk usage reports do not work with federated HDFS.

Severity: Low

Anticipated Resolution: To be fixed in a future release.

Workaround: None.

— (Applies to CDH4 only) Activity monitoring does not work on YARN activities.

Severity: Low

Anticipated Resolution: To be fixed in a future release.

Workaround: None.

— (Applies to CDH3 only) Uninstalling Oozie components in the wrong order will cause the uninstall to fail.

If you uninstall hue-oozie-auth-plugin (originally installed with Cloudera Manager 3.7) after uninstalling Oozie, the uninstall operation will fail and the hue-oozie-auth-plugin package will not be removed.

Severity: Low

Anticipated Resolution: None.

Workaround: Uninstall hue-oozie-auth-plugin before uninstalling Oozie. If you already attempted to uninstall hue-oozie-auth-plugin after Oozie, you must reinstall Oozie, uninstall hue-oozie-auth-plugin, and then uninstall Oozie again.

— HDFS monitoring configuration applies to all Nameservices.

The monitoring configurations at the HDFS level apply to all Nameservices. So, if there are two federated Nameservices, it's not possible to disable a check on one but not the other. Likewise, it's not possible to have different thresholds for the two Nameservices.

Severity: Low

Anticipated Resolution: To be fixed in a future release.

Workaround: None.

— Task details don't appear for CDH4 MR jobs in the Activity Monitor.

In the Activity Monitor, clicking the Task details for a job sometimes returns "No results found. Try expanding the time range". This happens because there is a lag between when the activity information appears and when its task details become available.

Severity: Low

Anticipated Resolution: To be fixed in a future release.

Workaround: Wait a bit and try again; results can take up to a full minute to appear.

— In CDH 4.0 and 4.1, for secure clusters only, Hue cannot connect to the Hive Metastore Server.

Severity: Medium

Anticipated Resolution: Fixed in CDH4.2.

Workaround: There are three workarounds:

  • Upgrade to CDH4.2.

  • Use Hue's safety valve for hive-site.xml to configure Hue to connect directly to the Hive Metastore database. These configuration values can be found by going to the Hive service, selecting a Hive Metastore Server, navigating to the processes page, expanding "show", and clicking hive-site.xml. You should include the following:
    <property>
      <name>javax.jdo.option.ConnectionURL</name>
      <value>JDBC_URL</value>
    </property>
    <property>
      <name>javax.jdo.option.ConnectionDriverName</name>
      <value>DRIVER_NAME</value>
    </property>
    <property>
      <name>javax.jdo.option.ConnectionUserName</name>
      <value>HIVE_DB_USER</value>
    </property>
    <property>
      <name>javax.jdo.option.ConnectionPassword</name>
      <value>HIVE_DB_PASSWORD</value>
    </property>
    <property>
      <name>hive.metastore.local</name>
      <value>true</value>
    </property>
    <property>
      <name>datanucleus.autoCreateSchema</name>
      <value>false</value>
    </property>
    <property>
      <name>datanucleus.metadata.validate</name>
      <value>false</value>
    </property>
    <property>
      <name>hive.warehouse.subdir.inherit.perms</name>
      <value>true</value>
    </property>
  • Select the "Bypass Hive Metastore" option in Hive service configuration, in the Advanced group. This is not the preferred solution because this configures any Hive CLI to bypass the Hive Metastore Server, even though Hive CLI works with Hive Metastore Server.