Known Issues Fixed in CDH 5.0.0
The following topics describe known issues fixed in CDH 5.0.0.
— AsyncHBaseSink does not work in CDH 5 Beta 1 and CDH 5 Beta 2
Workaround: Use the HBASE sink (org.apache.flume.sink.hbase.HBaseSink) to write to HBase in CDH 5 Beta releases.
— DataNode can consume 100 percent of one CPU
A narrow race condition can cause one of the threads in the DataNode process to get stuck in a tight loop and consume 100 percent of one CPU.
Workaround: Restart the DataNode process.
— HDFS NFS gateway does not work with Kerberos-enabled clusters
— Cannot browse filesystem via NameNode Web UI if any directory has the sticky bit set
When you list a directory that contains an entry with the sticky bit permission set (/tmp is often set this way), nothing appears where the list of files or directories should be.
Workaround: Use the Hue File Browser.
— Appending to a file that has been snapshotted previously will append to the snapshotted file as well
If you append content to a file that exists in a snapshot, the same content is appended to the copy of the file in the snapshot, invalidating the original snapshot.
Bug: See also HDFS-5343
— In MRv2 (YARN), the JobHistory Server has no information about a job if the ApplicationMaster fails while the job is running
— Hive queries that combine multiple splits and query large tables fail on YARN
java.io.IOException: Max block location exceeded for split: InputFormatClass: org.apache.hadoop.mapred.TextInputFormat splitsize: 21 maxsize: 10
    at org.apache.hadoop.mapreduce.split.JobSplitWriter.writeOldSplits(JobSplitWriter.java:162)
    at org.apache.hadoop.mapreduce.split.JobSplitWriter.createSplitFiles(JobSplitWriter.java:87)
    at org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:540)
    at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:510)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:392)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1268)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1265)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1265)
    at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562)
    at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:548)
    at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:425)
    at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:136)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:151)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:65)
Workaround: Set mapreduce.job.max.split.locations to a high value such as 100.
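To see why raising the limit helps, it may be useful to model the check that the error message implies: the split in the trace reports 21 block locations against a maximum of 10. The following is a simplified sketch inferred from that message only, not the Hadoop implementation; the function names are hypothetical.

```python
def check_split_locations(num_locations: int, max_locations: int) -> None:
    """Raise IOError if a split reports more block locations than allowed.

    Simplified model inferred from the error message; not Hadoop's code.
    """
    if num_locations > max_locations:
        raise IOError(
            "Max block location exceeded for split: "
            f"splitsize: {num_locations} maxsize: {max_locations}"
        )


def submits_ok(num_locations: int, max_locations: int) -> bool:
    """Return True if the modeled check passes, False if it raises."""
    try:
        check_split_locations(num_locations, max_locations)
        return True
    except IOError:
        return False


# Values taken from the error message: 21 locations, limit of 10.
print(submits_ok(21, 10))   # False
# Same split with the limit raised to 100, as in the workaround.
print(submits_ok(21, 100))  # True
```

Under this model, any limit at or above the number of locations in the largest combined split lets the job submit; 100 leaves a wide margin for the 21-location split shown in the trace.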
— Files in Avro tables no longer have .avro extension
As of CDH 4.3.0, Hive no longer creates files in Avro tables with the .avro extension by default. This does not cause any problems in Hive, but it could affect downstream components such as Pig, MapReduce, or Sqoop 1 that expect files with the .avro extension.
Workaround: Manually set the extension to .avro before using a query that inserts data into your Avro table. Use the following set statement:
— Oozie does not work seamlessly with ResourceManager HA
Oozie workflows are not recovered on ResourceManager failover when ResourceManager HA is enabled. Furthermore, users cannot specify the clusterId for JobTracker to work against either ResourceManager.
Workaround: On non-secure clusters, specify the host:port of either ResourceManager. On secure clusters, specify the host:port of the active ResourceManager.
— When using Oozie HA with security enabled, some znodes have world ACLs
Oozie High Availability with security enabled will still work, but a malicious user or program can alter znodes used by Oozie for locking, possibly causing Oozie to be unable to finish processing certain jobs.
— Oozie and Sqoop 2 may need additional configuration to work with YARN
In CDH 5, MapReduce 2.0 (MRv2, running on YARN) is recommended over the Hadoop 0.20-based MRv1. Unless you are using Cloudera Manager, however, the default configuration of Oozie and Sqoop 2 in CDH 5 Beta 2 may not reflect this.
Workaround: Check the value of CATALINA_BASE in /etc/oozie/conf/oozie-env.sh (if you are running an Oozie server) and in /etc/default/sqoop2-server (if you are running a Sqoop 2 server). If you invoke /usr/bin/sqoop2-server directly instead of using the service init scripts, also make sure that CATALINA_BASE is set correctly in your environment. For Oozie, CATALINA_BASE should be set to /usr/lib/oozie/oozie-server for YARN, or /usr/lib/oozie/oozie-server-0.20 for MRv1. For Sqoop 2, CATALINA_BASE should be set to /usr/lib/sqoop2/sqoop-server for YARN, or /usr/lib/sqoop2/sqoop-server-0.20 for MRv1.
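The check described in this workaround could be scripted roughly as follows. This is a minimal sketch, assuming the files contain plain shell assignments of the form CATALINA_BASE=... (optionally prefixed with export); it does not expand shell variables or follow conditionals. The expected paths come from the workaround text, while the function names and the "yarn"/"mrv1" keys are hypothetical.

```python
import re
from typing import Optional

# Expected CATALINA_BASE values, taken from the workaround text.
EXPECTED = {
    ("oozie", "yarn"): "/usr/lib/oozie/oozie-server",
    ("oozie", "mrv1"): "/usr/lib/oozie/oozie-server-0.20",
    ("sqoop2", "yarn"): "/usr/lib/sqoop2/sqoop-server",
    ("sqoop2", "mrv1"): "/usr/lib/sqoop2/sqoop-server-0.20",
}

# Matches uncommented lines such as: export CATALINA_BASE=/some/path
_ASSIGNMENT = re.compile(r"^\s*(?:export\s+)?CATALINA_BASE=(.*)$")


def find_catalina_base(text: str) -> Optional[str]:
    """Return the last literal CATALINA_BASE assignment in the text, or None."""
    value = None
    for line in text.splitlines():
        match = _ASSIGNMENT.match(line)
        if match:
            # Drop surrounding whitespace and simple quoting.
            value = match.group(1).strip().strip("\"'")
    return value


def is_configured_for(text: str, service: str, framework: str) -> bool:
    """Check whether the file content selects the given framework."""
    return find_catalina_base(text) == EXPECTED[(service, framework)]


# Hypothetical file content, for illustration only.
sample = "export CATALINA_BASE=/usr/lib/oozie/oozie-server-0.20\n"
print(is_configured_for(sample, "oozie", "yarn"))  # False
print(is_configured_for(sample, "oozie", "mrv1"))  # True
```

In practice you would read the text from /etc/oozie/conf/oozie-env.sh or /etc/default/sqoop2-server and pass it to is_configured_for.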
— Creating cores using the web UI with default values causes the system to become unresponsive
You can use the Solr Server web UI to create new cores. If you click Create Core without making any changes to the default attributes, the server may become unresponsive. Checking the log for the server shows a repeated error that begins:
ERROR org.apache.solr.cloud.Overseer: Exception in Overseer main queue loop
java.lang.IllegalArgumentException: Path must not end with / character
Workaround: To avoid this issue, do not create cores without first updating values for the new core in the web UI. For example, you might enter a new name for the core to be created.
If you created a core with default settings and are seeing this error, you can address the problem by finding which node is having problems and removing that node. Find the problematic node by using a tool that can inspect ZooKeeper, such as the Solr Admin UI. Using such a tool, examine items in the ZooKeeper queue, reviewing the properties for the item. The problematic node will have an item in its queue with the property collection="".
Remove the node that has the item with the collection="" property using a ZooKeeper management tool. For example, you can remove nodes using the ZooKeeper command-line tool or recent versions of Hue.
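The search for the problematic item can be sketched as a small filter. This sketch assumes that queue items are stored as JSON objects and that you have already exported them from ZooKeeper into a dictionary mapping znode path to raw bytes; both are assumptions, and the paths and payloads below are hypothetical. It only identifies candidates and does not connect to ZooKeeper or delete anything.

```python
import json
from typing import Dict, List


def find_empty_collection_items(items: Dict[str, bytes]) -> List[str]:
    """Return the paths of queue items whose "collection" property is ""."""
    flagged = []
    for path, payload in sorted(items.items()):
        try:
            props = json.loads(payload.decode("utf-8"))
        except (UnicodeDecodeError, ValueError):
            # Skip payloads that are not valid UTF-8 JSON.
            continue
        if isinstance(props, dict) and props.get("collection") == "":
            flagged.append(path)
    return flagged


# Hypothetical exported queue contents, for illustration only.
exported = {
    "/overseer/queue/qn-0000000001": b'{"operation": "state", "collection": "collection1"}',
    "/overseer/queue/qn-0000000002": b'{"operation": "state", "collection": ""}',
    "/overseer/queue/qn-0000000003": b"not json",
}
print(find_empty_collection_items(exported))
# ['/overseer/queue/qn-0000000002']
```

Review the flagged paths manually before removing anything with a ZooKeeper management tool.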