Issues Fixed in CDH 5.0.x
Issues Fixed in CDH 5.0.6
Upstream Issues Fixed
- HDFS-7960 - The full block report should prune zombie storages even if they're not empty
- HDFS-7278 - Add a command that allows sysadmins to manually trigger full block reports from a DN
- HDFS-6831 - Inconsistency between hdfs dfsadmin and hdfs dfsadmin -help
- HDFS-7596 - NameNode should prune dead storages from storageMap
- HDFS-7208 - NN does not schedule replication when a DN storage fails
- HDFS-7575 - Upgrade should generate a unique storage ID for each volume
- YARN-570 - Time strings are formatted in different timezone
- YARN-2251 - Avoid negative elapsed time in JHS/MRAM web UI and services
- HIVE-8874 - Error Accessing HBase from Hive via Oozie on Kerberos 5.0.1 cluster
- SOLR-6268 - HdfsUpdateLog has a race condition that can expose a closed HDFS FileSystem instance and should close its FileSystem instance if either inherited close method is called.
- SOLR-6393 - Improve transaction log replay speed on HDFS.
- SOLR-6403 - TransactionLog replay status logging.
Issues Fixed in CDH 5.0.5
“POODLE” Vulnerability on TLS/SSL enabled ports
The POODLE (Padding Oracle On Downgraded Legacy Encryption) attack takes advantage of a cryptographic flaw in the obsolete SSLv3 protocol, after first forcing the use of that protocol. The only solution is to disable SSLv3 entirely. This requires changes across a wide variety of components of CDH and Cloudera Manager in all current versions. CDH 5.0.5 provides these changes for CDH 5.0.x deployments.
For more information, see the Cloudera Security Bulletin.
Apache Hadoop Distributed Cache Vulnerability
The Distributed Cache Vulnerability allows a malicious cluster user to expose private files owned by the user running the YARN NodeManager process. For more information, see the Cloudera Security Bulletin.
Upstream Issues Fixed
- HADOOP-11243 - SSLFactory shouldn't allow SSLv3
- HDFS-7274 - Disable SSLv3 in HttpFS
- HDFS-7391 - Reenable SSLv2Hello in HttpFS
- HBASE-12376 - HBaseAdmin leaks ZK connections if failure starting watchers (ConnectionLossException)
- HIVE-8675 - Increase thrift server protocol test coverage
- HIVE-8827 - Remove SSLv2Hello from list of disabled protocols
- HUE-2438 - [core] Disable SSLv3 for Poodle vulnerability
- OOZIE-2034 - Disable SSLv3 (POODLEbleed vulnerability)
- OOZIE-2063 - Cron syntax creates duplicate actions
Issues Fixed in CDH 5.0.4
Upstream Issues Fixed
Issues Fixed in CDH 5.0.3
The following topics describe known issues fixed in CDH 5.0.3. See What's New in CDH 5.0.x for a list of the most important upstream problems fixed in this release.
YARN Fair Scheduler's Cluster Utilization Threshold check is broken
Workaround: Set the yarn.scheduler.fair.preemption.cluster-utilization-threshold property in yarn-site.xml to -1.
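As a rough illustration of this workaround, the entry in yarn-site.xml might look like the fragment below. The property name and value come from the workaround text; the surrounding layout is the usual Hadoop configuration format, and any other properties already in your file are left out.

```xml
<!-- yarn-site.xml (fragment): workaround value taken from the text above -->
<property>
  <name>yarn.scheduler.fair.preemption.cluster-utilization-threshold</name>
  <value>-1</value>
</property>
```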
When Oozie is configured to use MRv1 and TLS/SSL, YARN / MRv2 libraries are erroneously included in the classpath instead
This problem causes much of the configured Oozie functionality to be unusable.
Workaround: Use a different configuration (non-TLS/SSL or YARN), if possible.
Upstream Issues Fixed
Issues Fixed in CDH 5.0.2
The following topics describe known issues fixed in CDH 5.0.2. See What's New in CDH 5.0.x for a list of the most important upstream problems fixed in this release.
CDH 5 clients running releases 5.0.1 and earlier cannot use WebHDFS to connect to a CDH 4 cluster
Found 21 items
ls: Invalid value for webhdfs parameter "op": No enum const class org.apache.hadoop.hdfs.web.resources.GetOpParam.Op.GETACLSTATUS
Workaround: None; note that this is fixed as of CDH 5.0.2.
Endless Compaction Loop
If an empty HFile whose max timestamp is past its TTL (time-to-live) is selected for compaction, it is compacted into another empty HFile, which is selected for compaction, creating an endless compaction loop.
Upstream Issues Fixed
Issues Fixed in CDH 5.0.1
NameNode LeaseManager may crash
Some group mapping providers can cause the NameNode to crash
In certain environments, some group mapping providers can cause the NameNode to segfault and crash.
Workaround: Either configure ShellBasedUnixGroupsMapping in Hadoop, or configure SSSD in the operating system on the NameNode.
CREATE TABLE AS SELECT (CTAS) does not work with Parquet files
CREATE TABLE test_data(column1 string);
LOAD DATA LOCAL INPATH './data.txt' OVERWRITE INTO TABLE test_data;
CREATE TABLE parquet_test
  ROW FORMAT SERDE 'parquet.hive.serde.ParquetHiveSerDe'
  STORED AS
    INPUTFORMAT 'parquet.hive.DeprecatedParquetInputFormat'
    OUTPUTFORMAT 'parquet.hive.DeprecatedParquetOutputFormat'
  AS SELECT column1 FROM test_data;
SELECT * FROM parquet_test;
SELECT column1 FROM parquet_test;
CREATE TABLE parquet_test (column1 string)
  ROW FORMAT SERDE 'parquet.hive.serde.ParquetHiveSerDe'
  STORED AS
    INPUTFORMAT 'parquet.hive.DeprecatedParquetInputFormat'
    OUTPUTFORMAT 'parquet.hive.DeprecatedParquetOutputFormat';
INSERT OVERWRITE TABLE parquet_test SELECT * FROM test_data;
The oozie-workflow-0.4.5 schema has been removed
Workflows using schema 0.4.5 will no longer be accepted by Oozie because this schema definition version has been removed.
Workaround: Use schema 0.5. It's backwards compatible with 0.4.5, so updating the workflow is as simple as changing the schema version number.
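A minimal sketch of that change follows, assuming a hypothetical workflow named example-wf that contains no actions. Only the version number in the xmlns attribute differs from a 0.4.5 workflow, which would declare uri:oozie:workflow:0.4.5 instead.

```xml
<!-- workflow.xml: "example-wf" and its empty start-to-end flow are hypothetical -->
<workflow-app xmlns="uri:oozie:workflow:0.5" name="example-wf">
    <start to="end"/>
    <end name="end"/>
</workflow-app>
```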
Upstream Issues Fixed
- HADOOP-10442 - Group look-up can cause segmentation fault when a certain JNI-based mapping module is used.
- HADOOP-10456 - Bug in Configuration.java exposed by Spark (ConcurrentModificationException)
- HDFS-5064 - Standby checkpoints should not block concurrent readers
- HDFS-6039 - Uploading a File under a Dir with default ACLs throws "Duplicated ACLFeature"
- HDFS-6094 - The same block can be counted twice towards safe mode threshold
- HDFS-6231 - DFSClient hangs infinitely if using hedged reads and all eligible DataNodes die
- HIVE-6495 - TableDesc.getDeserializer() should use correct classloader when calling Class.forName()
- HIVE-6575 - select * fails on parquet table with map data type
- HIVE-6648 - Fixed permission inheritance for multi-partitioned tables
- HIVE-6740 - Fixed addition of Avro JARs to classpath
- HUE-2061 - Task logs are not retrieved if containers not on the same host
- OOZIE-1794 - java-opts and java-opt in the Java action don't always work properly in YARN
- SOLR-5608 - Frequently reproducible failures in CollectionsAPIDistributedZkTest#testDistribSearch
- YARN-1924 - STATE_STORE_OP_FAILED happens when ZKRMStateStore tries to update app(attempt) before storing it
Issues Fixed in CDH 5.0.0
AsyncHBaseSink does not work in CDH 5 Beta 1 and CDH 5 Beta 2
Workaround: Use the HBase sink (org.apache.flume.sink.hbase.HBaseSink) to write to HBase in CDH 5 Beta releases.
DataNode can consume 100 percent of one CPU
A narrow race condition can cause one of the threads in the DataNode process to get stuck in a tight loop and consume 100 percent of one CPU.
Workaround: Restart the DataNode process.
HDFS NFS gateway does not work with Kerberos-enabled clusters
Cannot browse filesystem via NameNode Web UI if any directory has the sticky bit set
When you list a directory that contains an entry with the sticky bit permission set (/tmp is often set this way), nothing appears where the list of files and directories should be.
Workaround: Use the Hue File Browser.
Appending to a file that has been snapshotted previously will append to the snapshotted file as well
If you append content to a file that exists in a snapshot, the same content is appended to the copy of the file in the snapshot, invalidating the original snapshot.
Bug: See also HDFS-5343
In MRv2 (YARN), the JobHistory Server has no information about a job if the ApplicationMaster fails while the job is running
An empty rowkey is treated as the first row of a table
An empty rowkey is allowed in HBase, but it was treated as the first row of the table, even if it was not in fact the first row. Also, multiple rows with empty rowkeys caused issues.
Workaround: Do not use empty rowkeys.
Hive queries that combine multiple splits and query large tables fail on YARN
java.io.IOException: Max block location exceeded for split: InputFormatClass: org.apache.hadoop.mapred.TextInputFormat splitsize: 21 maxsize: 10
  at org.apache.hadoop.mapreduce.split.JobSplitWriter.writeOldSplits(JobSplitWriter.java:162)
  at org.apache.hadoop.mapreduce.split.JobSplitWriter.createSplitFiles(JobSplitWriter.java:87)
  at org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:540)
  at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:510)
  at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:392)
  at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1268)
  at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1265)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:415)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
  at org.apache.hadoop.mapreduce.Job.submit(Job.java:1265)
  at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562)
  at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:415)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
  at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557)
  at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:548)
  at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:425)
  at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:136)
  at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:151)
  at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:65)
Workaround: Set mapreduce.job.max.split.locations to a high value such as 100.
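The workaround does not say where to set the property. As one possible sketch, assuming you want it applied cluster-wide through mapred-site.xml (that file choice is an assumption, not something this note prescribes), the entry could look like this:

```xml
<!-- mapred-site.xml (fragment): file placement is an assumption;
     the property name and example value come from the workaround above -->
<property>
  <name>mapreduce.job.max.split.locations</name>
  <value>100</value>
</property>
```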
Files in Avro tables no longer have .avro extension
As of CDH 4.3.0, Hive no longer creates files in Avro tables with the .avro extension by default. This does not cause any problems in Hive, but it could affect downstream components such as Pig, MapReduce, or Sqoop 1 that expect files with the .avro extension.
Workaround: Manually set the extension to .avro before using a query that inserts data into your Avro table. Use the following set statement:
Oozie does not work seamlessly with ResourceManager HA
Oozie workflows are not recovered on ResourceManager failover when ResourceManager HA is enabled. Further, users cannot specify the clusterId for JobTracker to work against either ResourceManager.
Workaround: On non-secure clusters, specify the host:port of either ResourceManager. On secure clusters, specify the host:port of the Active ResourceManager.
When using Oozie HA with security enabled, some znodes have world ACLs
Oozie High Availability with security enabled will still work, but a malicious user or program can alter znodes used by Oozie for locking, possibly causing Oozie to be unable to finish processing certain jobs.
Oozie and Sqoop 2 may need additional configuration to work with YARN
In CDH 5, MRv2 (MapReduce 2.0 on YARN) is recommended over the Hadoop 0.20-based MRv1. Unless you are using Cloudera Manager, however, the default configuration of Oozie and Sqoop 2 in CDH 5 Beta 2 may not reflect this.
Workaround: Check the value of CATALINA_BASE in /etc/oozie/conf/oozie-env.sh (if you are running an Oozie server) and /etc/default/sqoop2-server (if you are using a Sqoop 2 server). You should also ensure that CATALINA_BASE is correctly set in your environment if you are invoking /usr/bin/sqoop2-server directly instead of using the service init scripts. For Oozie, CATALINA_BASE should be set to /usr/lib/oozie/oozie-server for YARN, or /usr/lib/oozie/oozie-server-0.20 for MRv1. For Sqoop 2, CATALINA_BASE should be set to /usr/lib/sqoop2/sqoop-server for YARN, or /usr/lib/sqoop2/sqoop-server-0.20 for MRv1.
Creating cores using the web UI with default values causes the system to become unresponsive
You can use the Solr Server web UI to create new cores. If you click Create Core without making any changes to the default attributes, the server may become unresponsive. Checking the log for the server shows a repeated error that begins:
ERROR org.apache.solr.cloud.Overseer: Exception in Overseer main queue loop
java.lang.IllegalArgumentException: Path must not end with / character
Workaround: To avoid this issue, do not create cores without first updating values for the new core in the web UI. For example, you might enter a new name for the core to be created.
If you created a core with default settings and are seeing this error, you can address the problem by finding which node is having problems and removing that node. Find the problematic node by using a tool that can inspect ZooKeeper, such as the Solr Admin UI. Using such a tool, examine items in the ZooKeeper queue, reviewing the properties for the item. The problematic node will have an item in its queue with the property collection="".
Using a ZooKeeper management tool, remove the node whose queue item has the property collection="". For example, you can remove nodes using the ZooKeeper command line tool or recent versions of Hue.