Long-term component architecture

As the main curator of open standards in Hadoop, Cloudera has a track record of bringing new open source solutions into its platform (such as Apache Spark, Apache HBase, and Apache Parquet) that are eventually adopted by the community at large. Because these components are standards, you can build long-term architecture on them with confidence.

Thank you for choosing CDH. Your download instructions are below:

Installation

 

This section introduces options for installing Cloudera Manager, CDH, and managed services. You can install:

  • Cloudera Manager, CDH, and managed services in a Cloudera Manager deployment. This is the recommended method for installing CDH and managed services.
  • CDH 5 into an unmanaged deployment.

 

 

 

 

Cloudera Manager Deployment

 

A Cloudera Manager deployment consists of the following software components:

  • Oracle JDK
  • Cloudera Manager Server and Agent packages
  • Supporting database software
  • CDH and managed service software

This section describes the three main installation paths for creating a new Cloudera Manager deployment and the criteria for choosing an installation path. If your cluster already has an installation of a previous version of Cloudera Manager, follow the instructions in Upgrading Cloudera Manager.

 

The Cloudera Manager installation paths share some common phases, but each path differs in ways that suit different user and cluster host requirements:

  • Demonstration and proof of concept deployments - There are two installation options:
    • Installation Path A - Automated Installation by Cloudera Manager - Cloudera Manager automates the installation of the Oracle JDK, Cloudera Manager Server, embedded PostgreSQL database, and Cloudera Manager Agent, CDH, and managed service software on cluster hosts, and configures databases for the Cloudera Manager Server and Hive Metastore and optionally for Cloudera Management Service roles. This path is recommended for demonstration and proof of concept deployments, but is not recommended for production deployments because it is not intended to scale and may require database migration as your cluster grows. To use this method, server and cluster hosts must satisfy the following requirements:
      • Provide the ability to log in to the Cloudera Manager Server host using a root account or an account that has password-less sudo permission.
      • Allow the Cloudera Manager Server host to have uniform SSH access on the same port to all hosts. See Networking and Security Requirements for further information.
      • All hosts must have access to standard package repositories and either archive.cloudera.com or a local repository with the necessary installation files.
    • Installation Path B - Manual Installation Using Cloudera Manager Packages - You install the Oracle JDK, Cloudera Manager Server, and embedded PostgreSQL database packages on the Cloudera Manager Server host. You have two options for installing Oracle JDK, Cloudera Manager Agent, CDH, and managed service software on cluster hosts: install it manually or use Cloudera Manager to automate installation. However, for Cloudera Manager to automate installation of Cloudera Manager Agent packages or CDH and managed service software, cluster hosts must satisfy the following requirements:
      • Allow the Cloudera Manager Server host to have uniform SSH access on the same port to all hosts. See Networking and Security Requirements for further information.
      • All hosts must have access to standard package repositories and either archive.cloudera.com or a local repository with the necessary installation files.
  • Production deployments - require you to first manually install and configure a production database for the Cloudera Manager Server and Hive Metastore. There are two installation options:
    • Installation Path B - Manual Installation Using Cloudera Manager Packages - You install the Oracle JDK and Cloudera Manager Server packages on the Cloudera Manager Server host. You have two options for installing Oracle JDK, Cloudera Manager Agent, CDH, and managed service software on cluster hosts: install it manually or use Cloudera Manager to automate installation. However, for Cloudera Manager to automate installation of Cloudera Manager Agent packages or CDH and managed service software, cluster hosts must satisfy the following requirements:
      • Allow the Cloudera Manager Server host to have uniform SSH access on the same port to all hosts. See Networking and Security Requirements for further information.
      • All hosts must have access to standard package repositories and either archive.cloudera.com or a local repository with the necessary installation files.
    • Installation Path C - Manual Installation Using Cloudera Manager Tarballs - You install the Oracle JDK, Cloudera Manager Server, and Cloudera Manager Agent software as tarballs and use Cloudera Manager to automate installation of CDH and managed service software as parcels.
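The selection criteria above can be summarized as a small decision helper. The following is only an illustrative sketch in Python: the function name and its boolean parameters are hypothetical and are not part of any Cloudera tool. It encodes the rule that Path A applies only to demonstration or proof of concept deployments whose hosts meet all three Path A requirements, and that production deployments choose between Paths B and C.

```python
def candidate_paths(production, root_or_sudo_login, uniform_ssh, repo_access):
    """Return the installation paths (A, B, C) that match the criteria.

    All parameter names are hypothetical; they mirror the requirements
    listed in the text, not any Cloudera Manager API.
    """
    if production:
        # Production deployments need a manually installed production
        # database first, then use Path B or Path C.
        return ["B", "C"]

    paths = []
    # Path A needs root (or password-less sudo) login, uniform SSH access
    # on one port, and access to the package repositories.
    if root_or_sudo_login and uniform_ssh and repo_access:
        paths.append("A")
    # Path B is always possible for demonstrations, because the cluster
    # host software can be installed manually.
    paths.append("B")
    return paths


if __name__ == "__main__":
    print(candidate_paths(False, True, True, True))
    print(candidate_paths(True, True, True, True))
```

A host that cannot offer root or password-less sudo login falls back to Path B even for a proof of concept.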

 

 

 

 

Unmanaged Deployment

 

In an unmanaged deployment, you are responsible for managing all phases of the life cycle of CDH and managed service components on each host: installation, configuration, and service life cycle operations such as start and stop. This section describes alternatives for installing CDH 5 software in an unmanaged deployment.

  • Command-line methods:
    • Download and install the CDH 5 "1-click Install" package
    • Add the CDH 5 repository
    • Build your own CDH 5 repository
    If you use one of these command-line methods, the first (downloading and installing the "1-click Install" package) is recommended in most cases because it is simpler than building or adding a repository. See Installing the Latest CDH 5 Release for detailed instructions for each of these options.
  • Tarball - You can download a tarball from CDH downloads. Keep the following points in mind:
    • Installing CDH 5 from a tarball installs YARN.
    • In CDH 5, there is no separate tarball for MRv1. Instead, the MRv1 binaries, examples, etc., are delivered in the Hadoop tarball. The scripts for running MRv1 are in the bin-mapreduce1 directory in the tarball, and the MRv1 examples are in the examples-mapreduce1 directory.

 

 

 

 


CDH 5 provides packages for RHEL-compatible, SLES, Ubuntu, and Debian systems as described below.

Operating System                   Version                                    Packages
Red Hat Enterprise Linux (RHEL)-compatible
  RHEL                             5.7                                        64-bit
                                   5.10                                       64-bit
                                   6.4                                        64-bit
                                   6.5                                        64-bit
                                   6.5 in SE Linux mode                       64-bit
                                   6.6                                        64-bit
                                   6.6 in SE Linux mode                       64-bit
                                   6.7                                        64-bit
                                   7.1                                        64-bit
  CentOS                           5.7                                        64-bit
                                   5.10                                       64-bit
                                   6.4                                        64-bit
                                   6.5                                        64-bit
                                   6.5 in SE Linux mode                       64-bit
                                   6.6                                        64-bit
                                   6.6 in SE Linux mode                       64-bit
                                   6.7                                        64-bit
                                   7.1                                        64-bit
  Oracle Linux with default        5.6 (UEK R2)                               64-bit
  kernel and Unbreakable           6.4 (UEK R2)                               64-bit
  Enterprise Kernel                6.5 (UEK R2, UEK R3)                       64-bit
                                   6.6 (UEK R3)                               64-bit
                                   7.1                                        64-bit
SLES
  SUSE Linux Enterprise Server     11 with Service Pack 2                     64-bit
  (SLES)                           11 with Service Pack 3                     64-bit
Ubuntu/Debian
  Ubuntu                           Precise (12.04) - Long-Term Support (LTS)  64-bit
                                   Trusty (14.04) - Long-Term Support (LTS)   64-bit
  Debian                           Wheezy (7.0, 7.1)                          64-bit

Note:

  • CDH 5 provides only 64-bit packages.
  • Cloudera has received reports that RPMs work well on Fedora, but this has not been tested.
  • If you are using an operating system that is not supported by Cloudera packages, you can also download source tarballs from Downloads.

 

Important: Cloudera Enterprise is supported on platforms with Security-Enhanced Linux (SELinux) enabled. However, policies must be provided by other parties or created by the administrator of the cluster deployment. Cloudera is not responsible for policy support or policy enforcement, or for any issues that arise from them. If you experience issues with SELinux, contact your OS support provider.

Important: Cloudera supports RHEL 7 with the following limitations:

 

 

Component    MariaDB  MySQL                       SQLite   PostgreSQL                  Oracle      Derby (see Note 6)
Oozie        5.5      5.5, 5.6                    -        9.2, 9.3, 9.4 (see Note 3)  11gR2, 12c  Default
Flume        -        -                           -        -                           -           Default (for the JDBC Channel only)
Hue          5.5      5.1, 5.5, 5.6 (see Note 7)  Default  9.2, 9.3, 9.4 (see Note 3)  11gR2, 12c  -
Hive/Impala  5.5      5.5, 5.6 (see Note 1)       -        9.2, 9.3, 9.4 (see Note 3)  11gR2, 12c  Default
Sentry       5.5      5.5, 5.6 (see Note 1)       -        9.2, 9.3, 9.4 (see Note 3)  11gR2, 12c  -
Sqoop 1      5.5      See Note 4                  -        See Note 4                  See Note 4  -
Sqoop 2      5.5      See Note 5                  -        See Note 5                  See Note 5  Default

Note:

  1. MySQL 5.5 is supported on CDH 5.1. MySQL 5.6 is supported on CDH 5.1 and higher. The InnoDB storage engine must be enabled in the MySQL server.
  2. Cloudera Manager installation fails if GTID-based replication is enabled in MySQL.
  3. PostgreSQL 9.2 is supported on CDH 5.1 and higher. PostgreSQL 9.3 is supported on CDH 5.2 and higher. PostgreSQL 9.4 is supported on CDH 5.5 and higher.
  4. For purposes of transferring data only, Sqoop 1 supports MySQL 5.0 and above, PostgreSQL 8.4 and above, Oracle 10.2 and above, Teradata 13.10 and above, and Netezza TwinFin 5.0 and above. The Sqoop metastore works only with HSQLDB (1.8.0 and higher 1.x versions; the metastore does not work with any HSQLDB 2.x versions).
  5. Sqoop 2 can transfer data to and from MySQL 5.0 and above, PostgreSQL 8.4 and above, Oracle 10.2 and above, and Microsoft SQL Server 2012 and above. The Sqoop 2 repository database is supported only on Derby and PostgreSQL.
  6. Derby is supported as shown in the table, but not always recommended. See the pages for individual components in the Cloudera Installation and Upgrade guide for recommendations.
  7. CDH 5 Hue requires the default MySQL version of the operating system on which it is being installed, which is usually MySQL 5.1, 5.5, or 5.6.
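Note 3 ties each PostgreSQL release to a minimum CDH release. As a rough illustration, that rule can be written as a lookup in Python. The table values come from Note 3; the names `POSTGRES_MIN_CDH` and `supported_postgres` are hypothetical and exist only for this sketch.

```python
# Minimum CDH (major, minor) release for each PostgreSQL version, per Note 3.
POSTGRES_MIN_CDH = {
    "9.2": (5, 1),
    "9.3": (5, 2),
    "9.4": (5, 5),
}


def supported_postgres(cdh_version):
    """Return the PostgreSQL versions Note 3 lists for a CDH release.

    `cdh_version` is a dotted string such as "5.5.5"; only the major and
    minor parts are compared.
    """
    major_minor = tuple(int(part) for part in cdh_version.split(".")[:2])
    return sorted(
        pg for pg, minimum in POSTGRES_MIN_CDH.items() if major_minor >= minimum
    )


if __name__ == "__main__":
    for release in ("5.1.0", "5.3.2", "5.5.5"):
        print(release, supported_postgres(release))
```

The sketch covers only the version rule in Note 3; per-component restrictions in the table and in the other notes still apply.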

 

 


Important: There is one exception to the minimum supported and recommended JDK versions in the following table. If Oracle releases a security patch that affects server-side Java before the next minor release of Cloudera products, the Cloudera support policy covers customers using the patch.

CDH 5.5.x is supported with the versions shown in the following table:

 

Minimum Supported Version  Recommended Version  Exceptions
1.7.0_25                   1.7.0_80             None
1.8.0_31                   1.8.0_60             Cloudera recommends that you not use JDK 1.8.0_40.
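To show how the table might be applied, here is a small Python sketch that classifies a JDK version string of the form `1.8.0_60`. The function names and the four result labels are assumptions made for this example; only the version numbers are taken from the table. The sketch does not model the security-patch exception described in the Important note above.

```python
import re

# Minimum and recommended update numbers per JDK family, from the table.
JDK_TABLE = {
    (1, 7): {"minimum": 25, "recommended": 80},
    (1, 8): {"minimum": 31, "recommended": 60},
}
# Releases the table advises against.
DISCOURAGED = {(1, 8, 0, 40)}


def parse_jdk(version):
    """Parse a string such as '1.8.0_60' into (1, 8, 0, 60)."""
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)_(\d+)", version)
    if match is None:
        raise ValueError("unrecognized JDK version: %r" % version)
    return tuple(int(group) for group in match.groups())


def check_jdk(version):
    """Return 'unsupported', 'discouraged', 'recommended', or 'supported'."""
    parsed = parse_jdk(version)
    row = JDK_TABLE.get(parsed[:2])
    # The table lists only x.y.0 releases, so anything else is treated
    # as unsupported in this sketch.
    if row is None or parsed[2] != 0 or parsed[3] < row["minimum"]:
        return "unsupported"
    if parsed in DISCOURAGED:
        return "discouraged"
    if parsed[3] == row["recommended"]:
        return "recommended"
    return "supported"


if __name__ == "__main__":
    for candidate in ("1.7.0_21", "1.7.0_80", "1.8.0_40", "1.8.0_45"):
        print(candidate, check_jdk(candidate))
```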

 


Hue

Hue works with the two most recent versions of the following browsers. Cookies and JavaScript must be on.

  • Chrome
  • Firefox
  • Safari (not supported on Windows)
  • Internet Explorer
Hue might display in older versions of these browsers, and even in other browsers, but you might not have access to all of its features.


CDH requires IPv4. IPv6 is not supported.

See also Configuring Network Names.
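Because only IPv4 is supported, it can help to screen a planned host address list before installation. The following Python sketch uses the standard `ipaddress` module; the helper names and the sample addresses are hypothetical.

```python
import ipaddress


def is_ipv4(address):
    """Return True only if `address` is a valid IPv4 literal."""
    try:
        return ipaddress.ip_address(address).version == 4
    except ValueError:
        # Not an IP literal at all (for example, a host name or a typo).
        return False


def non_ipv4(addresses):
    """Return the entries that are not valid IPv4 addresses."""
    return [address for address in addresses if not is_ipv4(address)]


if __name__ == "__main__":
    planned = ["192.168.1.10", "fe80::1", "10.0.0.5"]
    print(non_ipv4(planned))
```

This checks address literals only; it does not resolve host names or inspect the network configuration of a running host.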


The following components are supported by the indicated versions of Transport Layer Security (TLS):

Table 1. Components Supported by TLS

Component        Role                   Name                                       Port   Version
Flume                                   Avro Source/Sink                           9099   TLS 1.2
HBase            Master                 HBase Master Web UI Port                   60010  TLS 1.2
HDFS             NameNode               Secure NameNode Web UI Port                50470  TLS 1.2
HDFS             Secondary NameNode     Secure Secondary NameNode Web UI Port      50495  TLS 1.2
HDFS             HttpFS                 REST Port                                  14000  TLS 1.0
Hive             HiveServer2            HiveServer2 Port                           10000  TLS 1.2
Hue              Hue Server             Hue HTTP Port                              8888   TLS 1.2
Cloudera Impala  Impala Daemon          Impala Daemon Beeswax Port                 21000  TLS 1.2
Cloudera Impala  Impala Daemon          Impala Daemon HiveServer2 Port             21050  TLS 1.2
Cloudera Impala  Impala Daemon          Impala Daemon Backend Port                 22000  TLS 1.2
Cloudera Impala  Impala Daemon          Impala Daemon HTTP Server Port             25000  TLS 1.2
Cloudera Impala  Impala StateStore      StateStore Service Port                    24000  TLS 1.2
Cloudera Impala  Impala StateStore      StateStore HTTP Server Port                25010  TLS 1.2
Cloudera Impala  Impala Catalog Server  Catalog Server HTTP Server Port            25020  TLS 1.2
Cloudera Impala  Impala Catalog Server  Catalog Server Service Port                26000  TLS 1.2
Oozie            Oozie Server           Oozie HTTPS Port                           11443  TLS 1.1, TLS 1.2
Solr             Solr Server            Solr HTTP Port                             8983   TLS 1.1, TLS 1.2
Solr             Solr Server            Solr HTTPS Port                            8985   TLS 1.1, TLS 1.2
YARN             ResourceManager        ResourceManager Web Application HTTP Port  8090   TLS 1.2
YARN             JobHistory Server      MRv1 JobHistory Web Application HTTP Port  19890  TLS 1.2
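One way to confirm which TLS version a listed port actually negotiates is to connect with a client that enforces a minimum version. The following Python sketch uses the standard `ssl` and `socket` modules. The function names are hypothetical, and the host name in the usage note is a placeholder, not a real endpoint.

```python
import socket
import ssl


def client_context(minimum=ssl.TLSVersion.TLSv1_2):
    """Build a client context that refuses anything older than `minimum`."""
    context = ssl.create_default_context()
    context.minimum_version = minimum
    return context


def negotiated_version(host, port, context, timeout=5.0):
    """Connect to host:port and return the negotiated protocol name.

    The return value is a string such as 'TLSv1.2'. The handshake raises
    ssl.SSLError if the server cannot meet the context's minimum version
    or if certificate verification fails.
    """
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            return tls.version()
```

For example, `negotiated_version("hue.example.com", 8888, client_context())` would report the version negotiated with a Hue server at that placeholder address. The default context verifies certificates, so a cluster that uses certificates from a private CA needs that CA loaded into the context first.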

 


Issues Fixed in CDH 5.5.5

Upstream Issues Fixed

The following upstream issues are fixed in CDH 5.5.5:

 

  • FLUME-2821 - KafkaSourceUtil can log passwords at Info; remove logging of security-related data in older releases
  • FLUME-2913 - Don't strip SLF4J from imported classpaths
  • FLUME-2918 - Speed up TaildirSource on directories with many files
  • HADOOP-8436 - NPE In getLocalPathForWrite ( path, conf ) when the required context item is not configured
  • HADOOP-8437 - getLocalPathForWrite should throw IOException for invalid paths
  • HADOOP-8751 - NPE in Token.toString() when Token is constructed using null identifier
  • HADOOP-8934 - Shell command ls should include sort options
  • HADOOP-10048 - LocalDirAllocator should avoid holding locks while accessing the filesystem
  • HADOOP-10971 - Add -C flag to make `hadoop fs -ls` print filenames only
  • HADOOP-11901 - BytesWritable fails to support 2G chunks due to integer overflow
  • HADOOP-11984 - Enable parallel JUnit tests in pre-commit
  • HADOOP-12252 - LocalDirAllocator should not throw NPE with empty string configuration
  • HADOOP-12259 - Utility to Dynamic port allocation
  • HADOOP-12659 - Incorrect usage of config parameters in token manager of KMS
  • HADOOP-12787 - KMS SPNEGO sequence does not work with WEBHDFS
  • HADOOP-12841 - Update s3-related properties in core-default.xml.
  • HADOOP-12901 - Add warning log when KMSClientProvider cannot create a connection to the KMS server.
  • HADOOP-12963 - Allow using path style addressing for accessing the s3 endpoint.
  • HADOOP-13079 - Add -q option to Ls to print ? instead of non-printable characters
  • HADOOP-13132 - Handle ClassCastException on AuthenticationException in LoadBalancingKMSClientProvider
  • HADOOP-13155 - Implement TokenRenewer to renew and cancel delegation tokens in KMS
  • HADOOP-13251 - Authenticate with Kerberos credentials when renewing KMS delegation token
  • HADOOP-13255 - KMSClientProvider should check and renew tgt when doing delegation token operations
  • HADOOP-13263 - Reload cached groups in background after expiry.
  • HADOOP-13457 - Remove hardcoded absolute path for shell executable.
  • HDFS-6434 - Default permission for creating file should be 644 for WebHdfs/HttpFS
  • HDFS-7597 - DelegationTokenIdentifier should cache the TokenIdentifier to UGI mapping
  • HDFS-8008 - Support client-side back off when the datanodes are congested
  • HDFS-8581 - ContentSummary on / skips further counts on yielding lock
  • HDFS-8829 - Make SO_RCVBUF and SO_SNDBUF size configurable for DataTransferProtocol sockets and allow configuring auto-tuning
  • HDFS-8897 - Balancer should handle fs.defaultFS trailing slash in HA
  • HDFS-9085 - Show renewer information in DelegationTokenIdentifier#toString
  • HDFS-9259 - Make SO_SNDBUF size configurable at DFSClient side for hdfs write scenario.
  • HDFS-9276 - Failed to Update HDFS Delegation Token for long running application in HA mode
  • HDFS-9365 - Balancer does not work with the HDFS-6376 HA setup.
  • HDFS-9405 - Warmup NameNode EDEK caches in background thread
  • HDFS-9466 - TestShortCircuitCache#testDataXceiverCleansUpSlotsOnFailure is flaky
  • HDFS-9700 - DFSClient and DFSOutputStream should set TCP_NODELAY on sockets for DataTransferProtocol
  • HDFS-9732 - Improve DelegationTokenIdentifier.toString() for better logging
  • HDFS-9805 - Add server-side configuration for enabling TCP_NODELAY for DataTransferProtocol and default it to true
  • HDFS-9939 - Increase DecompressorStream skip buffer size
  • HDFS-10360 - DataNode may format directory and lose blocks if current/VERSION is missing.
  • HDFS-10381 - DataStreamer DataNode exclusion log message should be warning.
  • HDFS-10396 - Using -diff option with DistCp may get "Comparison method violates its general contract" exception
  • HDFS-10481 - HTTPFS server should correctly impersonate as end user to open file
  • HDFS-10512 - VolumeScanner may terminate due to NPE in DataNode.reportBadBlocks
  • HDFS-10516 - Fix bug when warming up EDEK cache of more than one encryption zone
  • HDFS-10544 - Balancer doesn't work with IPFailoverProxyProvider.
  • HDFS-10643 - Namenode should use loginUser(hdfs) to generateEncryptedKey
  • MAPREDUCE-6442 - Stack trace is missing when error occurs in client protocol provider's constructor
  • MAPREDUCE-6473 - Job submission can take a long time during Cluster initialization
  • MAPREDUCE-6577 - MR AM unable to load native library without MR_AM_ADMIN_USER_ENV set
  • YARN-2605 - [RM HA] Rest api endpoints doing redirect incorrectly.
  • YARN-3055 - Fixed ResourceManager's DelegationTokenRenewer to not stop token renewal of applications part of a bigger workflow
  • YARN-3104 - Fixed RM to not generate new AMRM tokens on every heartbeat between rolling and activation
  • YARN-3832 - Resource Localization fails on a cluster due to existing cache directories
  • YARN-4459 - container-executor should only kill process groups
  • YARN-4784 - Fairscheduler: defaultQueueSchedulingPolicy should not accept FIFO.
  • YARN-5048 - DelegationTokenRenewer#skipTokenRenewal may throw NPE
  • YARN-5272 - Handle queue names consistently in FairScheduler.
  • HBASE-11625 - Verifies data before building HFileBlock.
  • HBASE-14155 - StackOverflowError in reverse scan
  • HBASE-14644 - Region in transition metric is broken
  • HBASE-14730 - Region server needs to log warnings when there are attributes configured for cells with hfile v2
  • HBASE-15439 - getMaximumAllowedTimeBetweenRuns in ScheduledChore ignores the TimeUnit
  • HBASE-15496 - Throw RowTooBigException only for user scan/get
  • HBASE-15707 - ImportTSV bulk output does not support tags with hfile.format.version=3
  • HBASE-15746 - Remove extra RegionCoprocessor preClose() in RSRpcServices#closeRegion
  • HBASE-15791 - Improve javadoc around ScheduledChore
  • HBASE-15811 - Batch Get after batch Put does not fetch all Cells
  • HBASE-15925 - Provide default values for hadoop compat module related properties that match default hadoop profile.
  • HBASE-16207 - Can't restore snapshot without "Admin" permission
  • HBASE-16288 - Revert "HFile intermediate block level indexes might recurse forever creating multi TB files"
  • HBASE-16288 - HFile intermediate block level indexes might recurse forever creating multi TB files
  • HIVE-7443 - Fix HiveConnection to communicate with Kerberized Hive JDBC server and alternative JDKs
  • HIVE-9499 - hive.limit.query.max.table.partition makes queries fail on non-partitioned tables
  • HIVE-10685 - Alter table concatenate operator will cause duplicate data
  • HIVE-10925 - Non-static threadlocals in metastore code can potentially cause memory leak
  • HIVE-11031 - ORC concatenation of old files can fail while merging column statistics
  • HIVE-11243 - Changing log level in Utilities.getBaseWork
  • HIVE-11369 - Mapjoins in HiveServer2 fail when jmxremote is used
  • HIVE-11408 - HiveServer2 is leaking ClassLoaders when add jar / temporary functions are used due to constructor caching in Hadoop ReflectionUtils
  • HIVE-11427 - Location of temporary table for CREATE TABLE SELECT broken by HIVE-7079.
  • HIVE-11747 - Unnecessary error log is shown when executing a "INSERT OVERWRITE LOCAL DIRECTORY" cmd in the embedded mode
  • HIVE-11827 - STORED AS AVRO fails SELECT COUNT(*) when empty
  • HIVE-12481 - Occasionally "Request is a replay" will be thrown from HS2
  • HIVE-12635 - Hive should return the latest hbase cell timestamp as the row timestamp value
  • HIVE-12958 - Make embedded Jetty server more configurable
  • HIVE-13285 - Orc concatenation may drop old files from moving to final path
  • HIVE-13462 - HiveResultSetMetaData.getPrecision() fails for NULL columns
  • HIVE-13527 - Using deprecated APIs in HBase client causes zookeeper connection leaks
  • HIVE-13570 - Some queries with Union all fail when CBO is off
  • HIVE-13590 - Kerberized HS2 with LDAP auth enabled fails in multi-domain LDAP case
  • HIVE-13704 - Don't call DistCp.execute() instead of DistCp.run()
  • HIVE-13736 - View's input/output formats are TEXT by default.
  • HIVE-13932 - Hive SMB Map Join with small set of LIMIT failed with NPE
  • HIVE-13953 - Issues in HiveLockObject equals method
  • HIVE-13991 - Union All on view fail with no valid permission on underneath table
  • HIVE-14006 - Hive query with UNION ALL fails with ArrayIndexOutOfBoundsException.
  • HIVE-14118 - Make the alter partition exception more meaningful
  • HUE-3520 - [jb] Fix backport error
  • HUE-3520 - [jb] Use impersonation to access JHS if security is enabled
  • HUE-3637 - [sqoop] Avoid decode errors on attribute values
  • HUE-3650 - [beeswax] Notify of caught errors in the watch logs process
  • HUE-3651 - [core] Upgrade Moment.js
  • HUE-3716 - [core] Add gen-py paths to hue.pth
  • HUE-3861 - [core] Upgrade Django Axes to 1.5
  • HUE-3866 - [core] Hue CPU reaches ~100% usage while uploading files with SSL to HTTPFS/WebHDFS
  • HUE-3880 - [core] Add importlib directly for Python 2.6
  • HUE-4005 - [oozie] Remove oozie.coord.application.path from properties when rerunning workflow
  • HUE-4006 - [oozie] Create new deployment directory when coordinator or bundle is copied
  • HUE-4007 - [oozie] Fix deployement_dir for the bundle in oozie example fixtures
  • HUE-4023 - [useradmin] update AuthenticationForm to allow activated users to login
  • HUE-4087 - [jobbrowser] Unable to kill jobs with Resource Manager HA enabled
  • HUE-4202 - [jb] Enable offset param for fetching jobbrowser logs
  • HUE-4215 - [yarn] Reset API_CACHE on logout
  • HUE-4227 - [yarn] Fix unittest for MR API Cache
  • HUE-4238 - [doc2] Ignore history docs in find_jobs_with_no_doc during sync documents
  • HUE-4252 - [core] Handle 307 redirect from YARN upon standby failover
  • HUE-4258 - [jb] Close and pool Spark History Server connections
  • HUE-4333 - [core] Properly reset API_CACHE on failover
  • HUE-4493 - [oozie] Fix sync-workflow action when Workflow includes sub-workflow
  • HUE-4515 - [oozie] Remove oozie.bundle.application.path from properties when rerunning workflow
  • OOZIE-2314 - Unable to kill old instance child job by workflow or coord rerun by Launcher
  • OOZIE-2329 - Make handling yarn restarts configurable
  • OOZIE-2330 - Spark action should take the global jobTracker and nameNode configs by default and allow file and archive elements
  • OOZIE-2345 - Parallel job submission for forked actions
  • OOZIE-2391 - spark-opts value in workflow.xml is not parsed properly
  • OOZIE-2436 - Fork/join workflow fails with oozie.action.yarn.tag must not be null
  • OOZIE-2481 - Add YARN_CONF_DIR in the Shell action
  • OOZIE-2504 - Create a log4j.properties under HADOOP_CONF_DIR in Shell Action
  • OOZIE-2511 - SubWorkflow missing variable set from option if config-default is present in parent workflow
  • OOZIE-2533 - Patch-1550 - workaround
  • OOZIE-2537 - SqoopMain does not set up log4j properly
  • SENTRY-1190 - IMPORT TABLE silently fails if Sentry is enabled
  • SENTRY-1201 - Sentry ignores database prefix for MSCK statement
  • SENTRY-1252 - grantServerPrivilege and revokeServerPrivilege should treat "*" and "ALL" as synonyms when action is not explicitly specified
  • SENTRY-1265 - Sentry service should not require a TGT as it is not talking to other kerberos services as a client
  • SENTRY-1292 - Reorder DBModelAction EnumSet
  • SENTRY-1293 - Avoid converting string permission to Privilege object
  • SOLR-6631 - DistributedQueue spinning on calling zookeeper getChildren()
  • SOLR-6879 - Have an option to disable autoAddReplicas temporarily for all collections.
  • SOLR-7178 - OverseerAutoReplicaFailoverThread compares Integer objects using ==
  • SOLR-8451 - Fix backport
  • SOLR-8497 - Merge indexes should mark its directories as done rather than keep them around in the directory cache.
  • SOLR-8551 - Make collection deletion more robust.
  • SOLR-8683 - Tune down stream closed logging
  • SOLR-9236 - AutoAddReplicas will append an extra /tlog to the update log location on replica failover.
  • SPARK-10577 - [PYSPARK] DataFrame hint for broadcast join
  • SPARK-11442 - Reduce numSlices for local metrics test of SparkListenerSuite
  • SPARK-12087 - [STREAMING] Create new JobConf for every batch in saveAsHadoopFiles
  • SQOOP-2846 - Sqoop Export with update-key failing for avro data file
