Long term component architecture

As the main curator of open standards in Hadoop, Cloudera has a track record of bringing new open source solutions into its platform (such as Apache Spark, Apache HBase, and Apache Parquet) that are eventually adopted by the community at large. Because these components are standards, you can build long-term architecture on them with confidence.

Thank you for choosing CDH. Your download instructions are below:

Installation

 

This section introduces options for installing Cloudera Manager, CDH, and managed services. You can install:

  • Cloudera Manager, CDH, and managed services in a Cloudera Manager deployment. This is the recommended method for installing CDH and managed services.
  • CDH 5 into an unmanaged deployment.

 

 

 

 

Cloudera Manager Deployment

 

A Cloudera Manager deployment consists of the following software components:

  • Oracle JDK
  • Cloudera Manager Server and Agent packages
  • Supporting database software
  • CDH and managed service software

This section describes the three main installation paths for creating a new Cloudera Manager deployment and the criteria for choosing an installation path. If your cluster already has an installation of a previous version of Cloudera Manager, follow the instructions in Upgrading Cloudera Manager.

 

The Cloudera Manager installation paths share some common phases, but the variant aspects of each path support different user and cluster host requirements:

  • Demonstration and proof of concept deployments - There are two installation options:
    • Installation Path A - Automated Installation by Cloudera Manager - Cloudera Manager automates the installation of the Oracle JDK, Cloudera Manager Server, embedded PostgreSQL database, Cloudera Manager Agent, CDH, and managed service software on cluster hosts, and configures databases for the Cloudera Manager Server and Hive Metastore and, optionally, for Cloudera Management Service roles. This path is recommended for demonstration and proof of concept deployments, but is not recommended for production deployments because it is not intended to scale and may require database migration as your cluster grows. To use this method, server and cluster hosts must satisfy the following requirements:
      • Provide the ability to log in to the Cloudera Manager Server host using a root account or an account that has password-less sudo permission.
      • Allow the Cloudera Manager Server host to have uniform SSH access on the same port to all hosts. See Networking and Security Requirements for further information.
      • All hosts must have access to standard package repositories and either archive.cloudera.com or a local repository with the necessary installation files.
    • Installation Path B - Manual Installation Using Cloudera Manager Packages - you install the Oracle JDK, Cloudera Manager Server, and embedded PostgreSQL database packages on the Cloudera Manager Server host. You have two options for installing Oracle JDK, Cloudera Manager Agent, CDH, and managed service software on cluster hosts: manually install it yourself or use Cloudera Manager to automate installation. However, in order for Cloudera Manager to automate installation of Cloudera Manager Agent packages or CDH and managed service software, cluster hosts must satisfy the following requirements:
      • Allow the Cloudera Manager Server host to have uniform SSH access on the same port to all hosts. See Networking and Security Requirements for further information.
      • All hosts must have access to standard package repositories and either archive.cloudera.com or a local repository with the necessary installation files.
  • Production deployments - require you to first manually install and configure a production database for the Cloudera Manager Server and Hive Metastore. There are two installation options:
    • Installation Path B - Manual Installation Using Cloudera Manager Packages - you install the Oracle JDK and Cloudera Manager Server packages on the Cloudera Manager Server host. You have two options for installing Oracle JDK, Cloudera Manager Agent, CDH, and managed service software on cluster hosts: manually install it yourself or use Cloudera Manager to automate installation. However, in order for Cloudera Manager to automate installation of Cloudera Manager Agent packages or CDH and managed service software, cluster hosts must satisfy the following requirements:
      • Allow the Cloudera Manager Server host to have uniform SSH access on the same port to all hosts. See Networking and Security Requirements for further information.
      • All hosts must have access to standard package repositories and either archive.cloudera.com or a local repository with the necessary installation files.
    • Installation Path C - Manual Installation Using Cloudera Manager Tarballs - you install the Oracle JDK, Cloudera Manager Server, and Cloudera Manager Agent software as tarballs and use Cloudera Manager to automate installation of CDH and managed service software as parcels.
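
The selection criteria above can be sketched as a small decision helper. This is only an illustration of the rules as described in this section, not a Cloudera tool; the function name and its boolean parameters are hypothetical.

```python
def suggest_installation_paths(production, root_or_sudo_login, uniform_ssh, repo_access):
    """Return the candidate installation paths ("A", "B", "C") for a new deployment.

    All parameter names are hypothetical. They mirror the requirements listed above:
    - production: True for a production deployment, False for demo / proof of concept
    - root_or_sudo_login: root or password-less sudo login on the Server host
    - uniform_ssh: uniform SSH access on the same port to all hosts
    - repo_access: hosts can reach standard repositories and archive.cloudera.com
      or a local repository
    """
    if production:
        # Production deployments first need a manually installed production
        # database, then use Path B or Path C.
        return ["B", "C"]

    paths = []
    # Path A is fully automated, so every one of its requirements must hold.
    if root_or_sudo_login and uniform_ssh and repo_access:
        paths.append("A")
    # Path B remains available because cluster hosts can be installed manually.
    paths.append("B")
    return paths
```

For example, `suggest_installation_paths(False, True, True, True)` yields both demonstration options, while a production deployment always yields Paths B and C.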

 

 

 

 

Unmanaged Deployment

 

In an unmanaged deployment, you are responsible for managing all phases of the life cycle of CDH and managed service components on each host: installation, configuration, and service life cycle operations such as start and stop. This section describes alternatives for installing CDH 5 software in an unmanaged deployment.

  • Command-line methods:
    • Download and install the CDH 5 "1-click Install" package
    • Add the CDH 5 repository
    • Build your own CDH 5 repository
    If you use one of these command-line methods, the first (downloading and installing the "1-click Install" package) is recommended in most cases because it is simpler than building or adding a repository. See Installing the Latest CDH 5 Release for detailed instructions for each of these options.
  • Tarball - You can download a tarball from CDH downloads. Keep the following points in mind:
    • Installing CDH 5 from a tarball installs YARN.
    • In CDH 5, there is no separate tarball for MRv1. Instead, the MRv1 binaries, examples, etc., are delivered in the Hadoop tarball. The scripts for running MRv1 are in the bin-mapreduce1 directory in the tarball, and the MRv1 examples are in the examples-mapreduce1 directory.

 

 

 

 


CDH 5 provides packages for RHEL-compatible, SLES, Ubuntu, and Debian systems as described below.

Operating System | Version | Packages

Red Hat Enterprise Linux (RHEL)-compatible
RHEL | 5.7, 5.10, 6.4, 6.5, 6.5 in SELinux mode, 6.6, 6.6 in SELinux mode, 6.7, 7.1 | 64-bit
CentOS | 5.7, 5.10, 6.4, 6.5, 6.5 in SELinux mode, 6.6, 6.6 in SELinux mode, 6.7, 7.1 | 64-bit
Oracle Linux with default kernel and Unbreakable Enterprise Kernel | 5.6 (UEK R2), 6.4 (UEK R2), 6.5 (UEK R2, UEK R3), 6.6 (UEK R3), 7.1 | 64-bit

SLES
SUSE Linux Enterprise Server (SLES) | 11 with Service Pack 2, 11 with Service Pack 3 | 64-bit

Ubuntu/Debian
Ubuntu | Precise (12.04) - Long-Term Support (LTS), Trusty (14.04) - Long-Term Support (LTS) | 64-bit
Debian | Wheezy (7.0, 7.1) | 64-bit

Note:

  • CDH 5 provides only 64-bit packages.
  • Cloudera has received reports that RPMs work well on Fedora, but this has not been tested.
  • If you are using an operating system that is not supported by Cloudera packages, you can also download source tarballs from Downloads.
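
As a rough illustration, the supported operating system list can be encoded as a lookup so that a host can be checked before installation. The dictionary keys, the version strings, and the function name below are assumptions made for this sketch; only the version numbers come from the list above.

```python
# Versions copied from the supported operating systems list above.
# The short distribution keys are an assumption of this sketch.
SUPPORTED_OS = {
    "RHEL": {"5.7", "5.10", "6.4", "6.5", "6.6", "6.7", "7.1"},
    "CentOS": {"5.7", "5.10", "6.4", "6.5", "6.6", "6.7", "7.1"},
    "Oracle Linux": {"5.6", "6.4", "6.5", "6.6", "7.1"},
    "SLES": {"11 SP2", "11 SP3"},
    "Ubuntu": {"12.04", "14.04"},
    "Debian": {"7.0", "7.1"},
}


def is_supported_os(distro, version, bits=64):
    """Return True if CDH 5 packages are listed for this distribution and version."""
    # CDH 5 provides only 64-bit packages.
    if bits != 64:
        return False
    return version in SUPPORTED_OS.get(distro, set())
```

A distribution that is missing from the table, such as Fedora, is reported as unsupported here even though RPMs are reported to work on it, because it has not been tested.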

 

Important: Cloudera Enterprise is supported on platforms with Security-Enhanced Linux (SELinux) enabled. However, policies need to be provided by other parties or created by the administrator of the cluster deployment. Cloudera is not responsible for policy support nor policy enforcement, nor for any issues with such. If you experience issues with SELinux, contact your OS support provider.

Important: Cloudera supports RHEL 7 with the following limitations:

 

 

Component | MariaDB | MySQL | SQLite | PostgreSQL | Oracle | Derby (see Note 6)
Oozie | 5.5 | 5.5, 5.6 | - | 9.2, 9.3, 9.4 (see Note 3) | 11gR2, 12c | Default
Flume | - | - | - | - | - | Default (for the JDBC Channel only)
Hue | 5.5 | 5.1, 5.5, 5.6 (see Note 7) | Default | 9.2, 9.3, 9.4 (see Note 3) | 11gR2, 12c | -
Hive/Impala | 5.5 | 5.5, 5.6 (see Note 1) | - | 9.2, 9.3, 9.4 (see Note 3) | 11gR2, 12c | Default
Sentry | 5.5 | 5.5, 5.6 (see Note 1) | - | 9.2, 9.3, 9.4 (see Note 3) | 11gR2, 12c | -
Sqoop 1 | 5.5 | See Note 4 | - | See Note 4 | See Note 4 | -
Sqoop 2 | 5.5 | See Note 5 | - | See Note 5 | See Note 5 | Default

Note:

  1. MySQL 5.5 is supported on CDH 5.1. MySQL 5.6 is supported on CDH 5.1 and higher. The InnoDB storage engine must be enabled in the MySQL server.
  2. Cloudera Manager installation fails if GTID-based replication is enabled in MySQL.
  3. PostgreSQL 9.2 is supported on CDH 5.1 and higher. PostgreSQL 9.3 is supported on CDH 5.2 and higher. PostgreSQL 9.4 is supported on CDH 5.5 and higher.
  4. For purposes of transferring data only, Sqoop 1 supports MySQL 5.0 and above, PostgreSQL 8.4 and above, Oracle 10.2 and above, Teradata 13.10 and above, and Netezza TwinFin 5.0 and above. The Sqoop metastore works only with HSQLDB (1.8.0 and higher 1.x versions; the metastore does not work with any HSQLDB 2.x versions).
  5. Sqoop 2 can transfer data to and from MySQL 5.0 and above, PostgreSQL 8.4 and above, Oracle 10.2 and above, and Microsoft SQL Server 2012 and above. The Sqoop 2 repository database is supported only on Derby and PostgreSQL.
  6. Derby is supported as shown in the table, but not always recommended. See the pages for individual components in the Cloudera Installation and Upgrade guide for recommendations.
  7. CDH 5 Hue requires the default MySQL version of the operating system on which it is being installed, which is usually MySQL 5.1, 5.5, or 5.6.
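
Note 3 ties each PostgreSQL release to a minimum CDH release. A minimal sketch of that rule follows; the function and constant names are hypothetical, and the check covers only the component databases governed by Note 3, not the Sqoop data-transfer targets in Notes 4 and 5.

```python
# Minimum CDH (major, minor) release for each PostgreSQL version, per Note 3.
MIN_CDH_FOR_POSTGRESQL = {
    "9.2": (5, 1),
    "9.3": (5, 2),
    "9.4": (5, 5),
}


def postgresql_supported(pg_version, cdh_version):
    """Return True if pg_version (e.g. "9.4") is listed for cdh_version (e.g. "5.5.2")."""
    minimum = MIN_CDH_FOR_POSTGRESQL.get(pg_version)
    if minimum is None:
        # PostgreSQL versions not named in Note 3 are treated as unsupported.
        return False
    # Compare only the major and minor parts, as integers, so 5.10 sorts after 5.5.
    major_minor = tuple(int(part) for part in cdh_version.split(".")[:2])
    return major_minor >= minimum
```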

 

 


Important: There is one exception to the minimum supported and recommended JDK versions in the following table. If Oracle releases a security patch that affects server-side Java before the next minor release of Cloudera products, the Cloudera support policy covers customers using the patch.

CDH 5.5.x is supported with the versions shown in the following table:

 

Minimum Supported Version | Recommended Version | Exceptions
1.7.0_25 | 1.7.0_80 | None
1.8.0_31 | 1.8.0_60 | Cloudera recommends that you not use JDK 1.8.0_40.
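
The JDK table can be read as a simple classification rule. The sketch below assumes version strings in the `1.<major>.0_<update>` form used in the table; the function names and the returned labels are hypothetical. Updates newer than the recommended one are labeled only "supported", which is an assumption, because the table does not say how to treat them.

```python
import re

# Values copied from the CDH 5.5.x JDK table above, keyed by Java major version.
MINIMUM_UPDATE = {7: 25, 8: 31}
RECOMMENDED_UPDATE = {7: 80, 8: 60}
DISCOURAGED = {(8, 40)}  # Cloudera recommends that you not use JDK 1.8.0_40.


def parse_jdk(version):
    """Split a string such as "1.8.0_60" into (major, update) integers."""
    match = re.fullmatch(r"1\.(\d+)\.0_(\d+)", version)
    if match is None:
        raise ValueError("unrecognized JDK version string: %r" % version)
    return int(match.group(1)), int(match.group(2))


def classify_jdk(version):
    """Return "unsupported", "not recommended", "recommended", or "supported"."""
    major, update = parse_jdk(version)
    if major not in MINIMUM_UPDATE or update < MINIMUM_UPDATE[major]:
        return "unsupported"
    if (major, update) in DISCOURAGED:
        return "not recommended"
    if update == RECOMMENDED_UPDATE[major]:
        return "recommended"
    return "supported"
```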

 


Hue

Hue works with the two most recent versions of the following browsers. Cookies and JavaScript must be on.

  • Chrome
  • Firefox
  • Safari (not supported on Windows)
  • Internet Explorer
Hue might display in older versions of these browsers, and even in other browsers, but you might not have access to all of its features.


CDH requires IPv4. IPv6 is not supported.

See also Configuring Network Names.
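
One way to catch IPv6 addresses in a host inventory before installation is to classify each configured address. This is a minimal sketch using Python's standard `ipaddress` module; the function name is hypothetical and the sample addresses in the comments are documentation-range examples, not real cluster hosts.

```python
import ipaddress


def is_ipv4_address(address):
    """Return True only for a syntactically valid IPv4 address string."""
    try:
        return ipaddress.ip_address(address).version == 4
    except ValueError:
        # Not an IP address at all (for example, a hostname or a typo).
        return False


# Example: is_ipv4_address("192.0.2.10") is True,
# while is_ipv4_address("2001:db8::1") is False.
```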


The following components are supported by the indicated versions of Transport Layer Security (TLS):

Table 1. Components Supported by TLS

Component | Role | Name | Port | Version
Flume | - | Avro Source/Sink | 9099 | TLS 1.2
HBase | Master | HBase Master Web UI Port | 60010 | TLS 1.2
HDFS | NameNode | Secure NameNode Web UI Port | 50470 | TLS 1.2
HDFS | Secondary NameNode | Secure Secondary NameNode Web UI Port | 50495 | TLS 1.2
HDFS | HttpFS | REST Port | 14000 | TLS 1.0
Hive | HiveServer2 | HiveServer2 Port | 10000 | TLS 1.2
Hue | Hue Server | Hue HTTP Port | 8888 | TLS 1.2
Cloudera Impala | Impala Daemon | Impala Daemon Beeswax Port | 21000 | TLS 1.2
Cloudera Impala | Impala Daemon | Impala Daemon HiveServer2 Port | 21050 | TLS 1.2
Cloudera Impala | Impala Daemon | Impala Daemon Backend Port | 22000 | TLS 1.2
Cloudera Impala | Impala Daemon | Impala Daemon HTTP Server Port | 25000 | TLS 1.2
Cloudera Impala | Impala StateStore | StateStore Service Port | 24000 | TLS 1.2
Cloudera Impala | Impala StateStore | StateStore HTTP Server Port | 25010 | TLS 1.2
Cloudera Impala | Impala Catalog Server | Catalog Server HTTP Server Port | 25020 | TLS 1.2
Cloudera Impala | Impala Catalog Server | Catalog Server Service Port | 26000 | TLS 1.2
Oozie | Oozie Server | Oozie HTTPS Port | 11443 | TLS 1.1, TLS 1.2
Solr | Solr Server | Solr HTTP Port | 8983 | TLS 1.1, TLS 1.2
Solr | Solr Server | Solr HTTPS Port | 8985 | TLS 1.1, TLS 1.2
YARN | ResourceManager | ResourceManager Web Application HTTP Port | 8090 | TLS 1.2
YARN | JobHistory Server | MRv1 JobHistory Web Application HTTP Port | 19890 | TLS 1.2
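
To confirm that a port listed as TLS 1.2 actually negotiates that version, a client can pin its context to TLS 1.2 and attempt a handshake. The sketch below uses only Python's standard `ssl` and `socket` modules; the function names are hypothetical, and it assumes the server certificate chains to a CA trusted by the client host (a cluster with self-signed certificates would need its own CA file loaded into the context).

```python
import socket
import ssl


def tls12_only_context():
    """Build a client context that accepts TLS 1.2 and nothing else."""
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    context.maximum_version = ssl.TLSVersion.TLSv1_2
    return context


def negotiated_tls_version(host, port, timeout=5.0):
    """Handshake with host:port and return the negotiated version string.

    The handshake raises ssl.SSLError if the server cannot speak TLS 1.2.
    """
    context = tls12_only_context()
    with socket.create_connection((host, port), timeout=timeout) as raw_socket:
        with context.wrap_socket(raw_socket, server_hostname=host) as tls_socket:
            return tls_socket.version()
```

A call such as `negotiated_tls_version(hiveserver2_host, 10000)`, where `hiveserver2_host` is whatever hostname your deployment uses, would be expected to return `"TLSv1.2"` for the HiveServer2 row. The HttpFS row lists TLS 1.0, so a TLS 1.2-only context is not a suitable probe for port 14000.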

 


Issues Fixed in CDH 5.5.2

 

 

Known Issues Fixed

 

The following topics describe known issues fixed in CDH 5.5.2.

 

 

Apache Spark

 

When using Spark on YARN, the driver reports misleading error messages

 

The Spark driver reports misleading error messages such as:

ERROR ErrorMonitor: AssociationError [akka.tcp://sparkDriver@...] ->
[akka.tcp://sparkExecutor@...]: Error [Association failed with [akka.tcp://sparkExecutor@...]]
[akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkExecutor@...]]

 

Workaround: Add the following property to the Spark log4j configuration file: log4j.logger.org.apache.spark.rpc.akka.ErrorMonitor=FATAL. See Configuring Spark Application Logging Properties.
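
The workaround amounts to making sure one property line is present in the log4j configuration. A small, hedged sketch of doing that idempotently on the text of a properties file follows; the function name is hypothetical, and where the Spark log4j file lives depends on your deployment, so the sketch works on a string rather than a path.

```python
# Property name and value taken from the workaround above.
ERROR_MONITOR_LOGGER = "log4j.logger.org.apache.spark.rpc.akka.ErrorMonitor"


def with_error_monitor_silenced(log4j_text):
    """Return log4j properties text with the ErrorMonitor logger set to FATAL.

    Any existing setting for the same logger is dropped first, so applying
    the function twice produces the same result as applying it once.
    """
    kept_lines = [
        line
        for line in log4j_text.splitlines()
        # Compare the key part only, ignoring whitespace around "=".
        if line.split("=", 1)[0].strip() != ERROR_MONITOR_LOGGER
    ]
    kept_lines.append(ERROR_MONITOR_LOGGER + "=FATAL")
    return "\n".join(kept_lines) + "\n"
```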

 

 

 

Spark does not support rolling upgrades

 

Spark does not support rolling upgrades. Submitted Spark jobs may fail during upgrade. Jobs requiring new configuration properties will fail.

Workaround: Finish the upgrade, and then relaunch the Spark jobs.

 

 

 

 

Hue

 

Cannot query the customers table in Hue

 

To query the customers table, you must re-create the Parquet data for compatibility.

Bug: HUE-3040

Severity: Low

Workaround: Update the parquet file of the customers table (/user/hive/warehouse/customers/customers) with the one attached to HUE-3040.

 

 

 

 

 

Upstream Issues Fixed

 

The following upstream issues are fixed in CDH 5.5.2:

  • AVRO-1781 - Remove LogicalTypes cache.
  • HADOOP-7713 - dfs -count -q should label output column
  • HADOOP-10406 - TestIPC.testIpcWithReaderQueuing may fail
  • HADOOP-10668 - Addendum patch to fix TestZKFailoverController
  • HADOOP-10668 - TestZKFailoverControllerStress#testExpireBackAndForth occasionally fails
  • HADOOP-11171 - Enable using a proxy server to connect to S3a
  • HADOOP-11218 - Add TLS 1.1, TLS 1.2 to KMS, HttpFS, SSLFactory
  • HADOOP-12269 - Update aws-sdk dependency to 1.10.6
  • HADOOP-12417 - TestWebDelegationToken failing with port in use
  • HADOOP-12418 - TestRPC.testRPCInterruptedSimple fails intermittently
  • HADOOP-12464 - Interrupted client may try to fail-over and retry
  • HADOOP-12468 - Partial group resolution failure should not result in user lockout
  • HADOOP-12474 - MiniKMS should use random ports for Jetty server by default
  • HADOOP-12568 - Update core-default.xml to describe posixGroups support
  • HADOOP-12573 - TestRPC.testClientBackOff failing
  • HADOOP-12584 - Disable browsing the static directory in HttpServer2
  • HADOOP-12584 - Revert - Disable browsing the static directory in HttpServer2
  • HADOOP-12584 - Disable browsing the static directory in HttpServer2
  • HADOOP-12604 - Exception may be swallowed in KMSClientProvider
  • HADOOP-12625 - Add a config to disable the /logs endpoints
  • HDFS-6101 - TestReplaceDatanodeOnFailure fails occasionally
  • HDFS-6533 - TestBPOfferService#testBasicFunctionalitytest fails intermittently
  • HDFS-6694 - TestPipelinesFailover.testPipelineRecoveryStress tests fail intermittently with various symptoms - debugging patch
  • HDFS-7553 - Fix the TestDFSUpgradeWithHA due to BindException
  • HDFS-7798 - Checkpointing failure caused by shared KerberosAuthenticator
  • HDFS-8647 - Abstract BlockManager's rack policy into BlockPlacementPolicy
  • HDFS-8722 - Optimize DataNode writes for small writes and flushes
  • HDFS-8772 - Fix TestStandbyIsHot#testDatanodeRestarts which occasionally fails
  • HDFS-8805 - Archival Storage: getStoragePolicy should not need superuser privilege
  • HDFS-9083 - Replication violates block placement policy
  • HDFS-9123 - Copying from the root to a subdirectory should be forbidden
  • HDFS-9160 - [OIV-Doc] : Missing details of 'delimited' for processor options
  • HDFS-9220 - Reading small file (< 512 bytes) that is open for append fails due to incorrect checksum
  • HDFS-9249 - NPE is thrown if an IOException is thrown in NameNode constructor
  • HDFS-9250 - Add precondition check to LocatedBlock#addCachedLoc
  • HDFS-9268 - fuse_dfs chown crashes when uid is passed as -1
  • HDFS-9273 - ACLs on root directory may be lost after NameNode restart
  • HDFS-9286 - HttpFs does not parse ACL syntax correctly for operation REMOVEACLENTRIES
  • HDFS-9295 - Add a thorough test of the full KMS code path
  • HDFS-9313 - Possible NullPointerException in BlockManager if no excess replica can be chosen
  • HDFS-9332 - Fix Precondition failures from NameNodeEditLogRoller while saving namespace
  • HDFS-9339 - Extend full test of KMS ACLs
  • HDFS-9364 - Unnecessary DNS resolution attempts when creating NameNodeProxies
  • HDFS-9410 - Some tests should always reset sysout and syserr
  • HDFS-9429 - Tests in TestDFSAdminWithHA intermittently fail with EOFException
  • HDFS-9438 - Only collect HDFS-6694 debug data on Linux, Mac, and Solaris
  • HDFS-9445 - DataNode may deadlock while handling a bad volume
  • HDFS-9470 - Encryption zone on root not loaded from fsimage after NameNode restart
  • HDFS-9474 - TestPipelinesFailover should not fail when printing debug message
  • MAPREDUCE-6191 - Improve clearing stale state of Java serialization testcase
  • MAPREDUCE-6233 - org.apache.hadoop.mapreduce.TestLargeSort.testLargeSort failed in trunk
  • MAPREDUCE-6549 - multibyte delimiters with LineRecordReader cause duplicate records
  • MAPREDUCE-6550 - archive-logs tool changes log ownership to the Yarn user when using DefaultContainerExecutor
  • YARN-3564 - Fix TestContainerAllocation.testAMContainerAllocationWhenDNSUnavailable fails randomly
  • YARN-3768 - ArrayIndexOutOfBoundsException with empty environment variables
  • YARN-4235 - FairScheduler PrimaryGroup does not handle empty groups returned for a user
  • YARN-4310 - FairScheduler: Log skipping reservation messages at DEBUG level
  • YARN-4347 - Resource manager fails with Null pointer exception
  • YARN-4408 - Fix issue that NodeManager still reports negative running containers
  • HBASE-6617 - ReplicationSourceManager should be able to track multiple WAL paths
  • HBASE-12961 - Fix negative values in read and write region server metrics
  • HBASE-13134 - mutateRow and checkAndMutate apis don't throw region level exceptions
  • HBASE-13703 - ReplicateContext should not be a member of ReplicationSource
  • HBASE-13746 - list_replicated_tables command is not listing table in HBase shell
  • HBASE-13833 - LoadIncrementalHFile.doBulkLoad(Path, HTable) doesn't handle unmanaged connections when using SecureBulkLoad
  • HBASE-14003 - Work around JDK-8044053
  • HBASE-14205 - RegionCoprocessorHost System.nanoTime() performance bottleneck
  • HBASE-14283 - Reverse scan doesn’t work with HFile inline index/bloom blocks
  • HBASE-14501 - NPE in replication with TDE
  • HBASE-14533 - Connection Idle time 1 second is too short and the connection is closed too quickly by the ChoreService. Increase it to the default (10 minutes) for testAll(). The patch is not committed upstream yet.
  • HBASE-14541 - TestHFileOutputFormat.testMRIncrementalLoadWithSplit failed due to too many splits and few retries
  • HBASE-14547 - Add more debug/trace to zk-procedure
  • HBASE-14621 - ReplicationLogCleaner stuck on RS crash
  • HBASE-14731 - Add -DuseMob option to ITBLL
  • HBASE-14809 - Grant / revoke namespace admin permission to group
  • HBASE-14923 - VerifyReplication should not mask the exception during result comparison
  • HBASE-14926 - Hung ThriftServer; no timeout on read from client; if client crashes, worker thread gets stuck reading
  • HBASE-15031 - Fix merge of MVCC and SequenceID performance regression in branch-1.0
  • HBASE-15032 - HBase shell scan filter string assumes UTF-8 encoding
  • HBASE-15035 - Bulkloading HFiles with tags that require splits do not preserve tags
  • HBASE-15104 - Occasional failures due to NotServingRegionException in IT tests
  • HIVE-7575 - GetTables thrift call is very slow
  • HIVE-7653 - Hive AvroSerDe does not support circular references in Schema
  • HIVE-9507 - Make "LATERAL VIEW inline(expression) mytable" tolerant to nulls
  • HIVE-10027 - Use descriptions from Avro schema files in column comments
  • HIVE-10048 - JDBC - Support SSL encryption regardless of Authentication mechanism
  • HIVE-10083 - SMBJoin fails in case one table is uninitialized
  • HIVE-10265 - Hive CLI crashes on != inequality
  • HIVE-10514 - Fix MiniCliDriver tests failure
  • HIVE-10687 - AvroDeserializer fails to deserialize evolved union fields
  • HIVE-10697 - ObjectInspectorConvertors#UnionConvertor does a faulty conversion
  • HIVE-11149 - Fix issue with sometimes HashMap in PerfLogger.java hangs
  • HIVE-11288 - Backport:Avro SerDe InstanceCache returns incorrect schema
  • HIVE-11513 - AvroLazyObjectInspector could handle empty data better
  • HIVE-11616 - DelegationTokenSecretManager reuses the same objectstore, which has concurrency issues
  • HIVE-11785 - Revert - Support escaping carriage return and new line for LazySimpleSerDe
  • HIVE-11785 - Support escaping carriage return and new line for LazySimpleSerDe
  • HIVE-11826 - 'hadoop.proxyuser.hive.groups' configuration does not prevent unauthorized user to access metastore
  • HIVE-11977 - Hive should handle an external Avro table with zero length files present
  • HIVE-12008 - Hive queries failing when using count(*) on column in view
  • HIVE-12058 - Change Hive script to record errors when calling hbase fails
  • HIVE-12188 - DoAs does not work properly in non-Kerberos secured HiveServer2
  • HIVE-12189 - The list in pushdownPreds of ppd.ExprWalkerInfo should not be allowed to grow very large
  • HIVE-12218 - Unable to create a like table for an HBase-backed table
  • HIVE-12250 - Zookeeper connection leaks in Hive's HBaseHandler
  • HIVE-12265 - Generate lineage info only if requested
  • HIVE-12268 - Context leaks deleteOnExit paths
  • HIVE-12278 - Skip logging lineage for explain queries
  • HIVE-12287 - Lineage for lateral view shows wrong dependencies
  • HIVE-12330 - Fix precommit Spark test part2
  • HIVE-12365 - Added resource path is sent to cluster as an empty string when externally removed
  • HIVE-12378 - Exception on HBaseSerDe.serialize binary field
  • HIVE-12388 - GetTables cannot get external tables when TABLE type argument is given
  • HIVE-12406 - HIVE-9500 introduced incompatible change to LazySimpleSerDe public interface
  • HIVE-12418 - HiveHBaseTableInputFormat.getRecordReader() causes Zookeeper connection leak
  • HIVE-12505 - Backport: Insert overwrite in same encrypted zone silently fails to remove some existing files
  • HIVE-12566 - Incorrect result returns when using COALESCE in WHERE condition with LEFT JOIN
  • HIVE-12713 - Miscellaneous improvements in driver compile and execute logging
  • HIVE-12784 - Group by SemanticException: Invalid column reference
  • HIVE-12788 - Setting hive.optimize.union.remove to TRUE will break UNION ALL with aggregate functions
  • HIVE-12795 - Vectorized execution causes ClassCastException
  • HUE-2664 - Revert - [jobbrowser] Fix fetching logs from job history server
  • HUE-2997 - [oozie] Easier usage of email action when workflow fails
  • HUE-3035 - [beeswax] Optimize sample data query for partitioned tables
  • HUE-3036 - [beeswax] Revert get_tables to use Thrift API GetTables
  • HUE-3091 - [oozie] Do not remove extra new lines from email action body
  • IMPALA-1459 - Fix migration/assignment of On-clause predicates inside inline views.
  • IMPALA-2103 - Fix flaky test_impersonation test
  • IMPALA-2113 - Handle error when distinct and aggregates are used with a having clause
  • IMPALA-2225 - Handle error when star based select item and aggregate are incorrectly used
  • IMPALA-2226 - Throw AnalysisError if table properties are too large
  • IMPALA-2273 - Make MAX_PAGE_HEADER_SIZE configurable
  • IMPALA-2473 - Reduce scanner memory usage
  • IMPALA-2535 - PAGG hits mem_limit when switching to I/O buffers
  • IMPALA-2558 - DCHECK in Parquet scanner after block read error
  • IMPALA-2559 - Fix check failed: sorter_runs_.back()->is_pinned_
  • IMPALA-2591 - DataStreamSender::Send() does not return an error status if SendBatch() failed
  • IMPALA-2598 - Re-enable SSL and Kerberos on server-server
  • IMPALA-2612 - Free local allocations once for every row batch when building hash tables
  • IMPALA-2614 - Don't ignore Status returned by DataStreamRecvr::CreateMerger()
  • IMPALA-2624 - Increase fs.trash.interval to 24 hours for test suite
  • IMPALA-2630 - Skip TestParquet.test_continue_on_error when using old aggs/joins
  • IMPALA-2643 - Prevent migrating incorrectly inferred identity predicates into inline views
  • IMPALA-2648 - Avoid sending large partition stats objects over thrift
  • IMPALA-2695 - Fix GRANTs on URIs with uppercase letters
  • IMPALA-2722 - Free local allocations per row batch in non-partitioned AGG and HJ
  • IMPALA-2731 - Refactor MemPool usage in HBase scan node
  • IMPALA-2747 - Thrift-client cleans openSSL state before using it in the case of the catalog
  • IMPALA-2776 - Remove escapechartesttable and associated tests
  • IMPALA-2812 - Remove additional test referencing escapecharstesttable
  • IMPALA-2829 - SEGV in AnalyticEvalNode touching NULL input_stream_
  • KITE-1089 - ReadAvroContainer morphline command should work even if the Avro writer schema of each input file is different
  • KITE-1097 - Add method to read the name of a Morphline command
  • OOZIE-2030 - Configuration properties from global section is not getting set in Hadoop job conf when using sub-workflow action in Oozie workflow.xml
  • OOZIE-2365 - Oozie fails to start when SMTP password not set
  • OOZIE-2380 - Oozie Hive action failed with wrong tmp path
  • OOZIE-2397 - LAST_ONLY and NONE don't properly handle READY actions
  • OOZIE-2413 - Kerberos credentials can expire if the KDC is slow to respond
  • OOZIE-2439 - FS Action no longer uses name-node from global section or default NN
  • OOZIE-2441 - SubWorkflow action with propagate-configuration but no global section throws NPE on submit
  • PIG-3641 - Split "otherwise" producing incorrect output when combined with ColumnPruning
  • SENTRY-565 - Improve performance of filtering Hive SHOW commands
  • SENTRY-835 - Drop table leaves a connection open when using metastorelistener
  • SENTRY-902 - SimpleDBProviderBackend should retry the authorization process properly
  • SENTRY-936 - getGroup and getUser should always return original HDFS values for paths in prefix which are not managed by Sentry
  • SENTRY-944 - Setting HDFS rules on Sentry-managed HDFS paths should not affect original HDFS rules
  • SENTRY-953 - External Partitions which are referenced by more than one table can cause some unexpected behavior with Sentry HDFS sync
  • SENTRY-957 - Exceptions in MetastoreCacheInitializer should probably not prevent HMS from starting up
  • SENTRY-960 - Blacklist reflect, java_method using hive.server2.builtin.udf.blacklist
  • SENTRY-988 - Let SentryAuthorization setter path always fall through and update HDFS
  • SENTRY-994 - SentryAuthorizationInfoX should override isSentryManaged
  • SOLR-6443 - backport: Disable test that fails on Jenkins until we can determine the problem
  • SOLR-7049 - LIST Collections API call should be processed directly by the CollectionsHandler instead of the OverseerCollectionProcessor
  • SOLR-7989 - After a new leader is elected it, it should ensure it's state is ACTIVE if it has already registered with ZooKeeper
  • SOLR-8075 - Fix faulty implementation
  • SOLR-8152 - Overseer Task Processor/Queue can miss responses, leading to timeouts
  • SOLR-8223 - Avoid accidentally swallowing OutOfMemoryError
  • SOLR-8288 - DistributedUpdateProcessor#doFinish should explicitly check and ensure it does not try to put itself into LIR
  • SOLR-8353 - Support regex for skipping license checksums
  • SOLR-8367 - Fix the LeaderInitiatedRecovery 'all replicas participate' fail-safe
  • SOLR-8372 - backport: Canceled recovery can lead to data loss
  • SOLR-8535 - Support forcing define-lucene-javadoc-url to be local
  • SPARK-5569 - [STREAMING] Fix ObjectInputStreamWithLoader for supporting load array classes
  • SPARK-8029 - Robust shuffle writer
  • SPARK-9735 - [SQL] Respect the user specified schema than the infer partition schema for HadoopFsRelation
  • SPARK-10648 - Oracle dialect to handle nonspecific numeric types
  • SPARK-10865 - [SPARK-10866] [SQL] Fix bug of ceil/floor, which should returns long instead of the Double type
  • SPARK-11105 - [YARN] Distribute log4j.properties to executors
  • SPARK-11126 - [SQL] Fix the potential flaky test
  • SPARK-11126 - [SQL] Fix a memory leak in SQLListener._stageIdToStageMetrics
  • SPARK-11246 - [SQL] Table cache for Parquet broken in 1.5
  • SPARK-11453 - [SQL] Append data to partitioned table will messes up the result
  • SPARK-11484 - [WEBUI] Using proxyBase set by Spark AM
  • SPARK-11786 - [CORE] Tone down messages from akka error monitor
  • SPARK-11799 - [CORE] Make it explicit in executor logs that uncaught exceptions are thrown during executor shutdown
  • SPARK-11929 - [CORE] Make the repl log4j configuration override the root logger
  • SQOOP-2745 - Using datetime column as a splitter for Oracle no longer works
  • SQOOP-2767 - Test is failing SystemImportTest
  • SQOOP-2783 - Query import with parquet fails on incompatible schema
  • SQOOP-2422 - Sqoop2: Test TestJSONIntermediateDataFormat is failing on JDK8

 

 

