

The recommended tool for installing Cloudera Enterprise

This download installs Cloudera Enterprise or Cloudera Express.

 

Cloudera Enterprise requires a license; however, when installing Cloudera Express you will have the option to unlock Cloudera Enterprise features for a free 60-day trial.

 

Once the trial has concluded, the Cloudera Enterprise features will be disabled until you obtain and upload a license.

 

Note: All CDH hosts that make up a logical cluster need to run on the same major OS release to be covered by Cloudera Support. Cloudera Manager needs to run on the same OS release as one of the CDH clusters it manages, to be covered by Cloudera Support. The risk of issues caused by running different minor OS releases is considered lower than the risk of running different major OS releases. Cloudera recommends running the same minor release cross-cluster, because it simplifies issue tracking and supportability.

 

CDH 5 provides 64-bit packages for RHEL-compatible, SLES, Ubuntu, and Debian systems as listed below.

 

Operating System: Supported Versions

Red Hat Enterprise Linux (RHEL)-compatible
  • RHEL (+ SELinux mode in available versions): 7.2, 7.1, 6.8, 6.7, 6.6, 6.5, 6.4, 5.11, 5.10, 5.7
  • CentOS (+ SELinux mode in available versions): 7.2, 7.1, 6.8, 6.7, 6.6, 6.5, 6.4, 5.11, 5.10, 5.7
  • Oracle Enterprise Linux (OEL) with Unbreakable Enterprise Kernel (UEK): 7.2 (UEK R2), 7.1, 6.8 (UEK R3), 6.7 (UEK R3), 6.6 (UEK R3), 6.5 (UEK R2, UEK R3), 6.4 (UEK R2), 5.11, 5.10, 5.7

SLES
  • SUSE Linux Enterprise Server (SLES): 12 with Service Pack 1; 11 with Service Pack 4, 3, or 2
    Hosts running Cloudera Manager Agents must use SUSE Linux Enterprise Software Development Kit 11 SP1.

Ubuntu/Debian
  • Ubuntu: Trusty 14.04 (Long-Term Support), Precise 12.04 (Long-Term Support)
  • Debian: Jessie 8.4, 8.2; Wheezy 7.8, 7.1, 7.0

 

Important: Cloudera supports RHEL 7 with the following limitations:

 

Note:

  • Cloudera Enterprise is supported on platforms with Security-Enhanced Linux (SELinux) enabled. Cloudera is not responsible for policy support nor policy enforcement. If you experience issues with SELinux, contact your OS provider.
  • CDH 5.9 DataNode hosts with EMC® DSSD™ D5™ are supported on RHEL 6.6, 7.1, and 7.2.

 


The version of Oracle JDK supported by Cloudera Manager depends on the version of CDH being managed. For more information, see CDH and Cloudera Manager Supported JDK Versions.

 

Cloudera Manager can install Oracle JDK 1.7.0_67 during installation and upgrade. If you prefer to install the JDK yourself, follow the instructions in Java Development Kit Installation.


The Cloudera Manager Admin Console, which you use to install, configure, manage, and monitor services, supports the following browsers:

  • Mozilla Firefox 24 and 31.
  • Google Chrome 36 and higher.
  • Internet Explorer 9 and higher (Internet Explorer 11 in native mode).
  • Safari 5 and higher.

Cloudera Manager requires several databases. The Cloudera Manager Server stores information about configured services, role assignments, configuration history, commands, users, and running processes in a database of its own. You must also specify a database for the Activity Monitor and Reports Manager roles.

Important: When you restart processes, the configuration for each of the services is redeployed using information saved in the Cloudera Manager database. If this information is not available, your cluster does not start or function correctly. You must schedule and maintain regular backups of the Cloudera Manager database to recover the cluster in the event of the loss of this database.

The database you use must be configured to support UTF8 character set encoding. The embedded PostgreSQL database installed when you follow Installation Path A - Automated Installation by Cloudera Manager (Non-Production Mode) automatically provides UTF8 encoding. If you install a custom database, you might need to enable UTF8 encoding. The commands for enabling UTF8 encoding are described in each database topic under Cloudera Manager and Managed Service Datastores.
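As a sketch of the UTF8 requirement above, the following commands create a UTF8-encoded database by hand. The database name "scm" and the administrative accounts are illustrative assumptions, not mandated names; see the database topics under Cloudera Manager and Managed Service Datastores for the authoritative commands.

```shell
# MySQL/MariaDB: create a database with UTF8 encoding ("scm" is a placeholder
# name; run as a user with CREATE privileges on a running server).
mysql -u root -p -e "CREATE DATABASE scm DEFAULT CHARACTER SET utf8;"

# PostgreSQL: create a UTF8-encoded database with the createdb utility.
sudo -u postgres createdb --encoding=UTF8 scm
```

Both commands assume the database server is already installed and running.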

After installing a database, upgrade to the latest patch version and apply any other appropriate updates. Available updates may be specific to the operating system on which the database is installed.

Cloudera supports the shipped version of MariaDB, MySQL, and PostgreSQL for each supported Linux distribution.

 

Supported databases by component. "Default" marks the database a component uses when none is configured; for Derby, see Note 5.

  • Cloudera Manager - MariaDB 5.5, 10; MySQL 5.6, 5.5, 5.1; PostgreSQL 9.4, 9.3, 9.2, 9.1, 8.4, 8.3, 8.1; Oracle 12c, 11gR2
  • Oozie - MariaDB 5.5, 10; MySQL 5.6, 5.5, 5.1; PostgreSQL 9.4, 9.3, 9.2, 9.1, 8.4, 8.3, 8.1 (see Note 3); Oracle 12c, 11gR2; Derby (Default)
  • Flume - Derby (Default, for the JDBC Channel only)
  • Hue - MariaDB 5.5, 10; MySQL 5.6, 5.5, 5.1 (see Note 6); SQLite (Default); PostgreSQL 9.4, 9.3, 9.2, 9.1, 8.4, 8.3, 8.1 (see Note 3); Oracle 12c, 11gR2
  • Hive/Impala - MariaDB 5.5, 10; MySQL 5.6, 5.5, 5.1 (see Note 1); PostgreSQL 9.4, 9.3, 9.2, 9.1, 8.4, 8.3, 8.1 (see Note 3); Oracle 12c, 11gR2; Derby (Default)
  • Sentry - MariaDB 5.5, 10; MySQL 5.6, 5.5, 5.1 (see Note 1); PostgreSQL 9.4, 9.3, 9.2, 9.1, 8.4, 8.3, 8.1 (see Note 3); Oracle 12c, 11gR2
  • Sqoop 1 - MariaDB 5.5, 10; MySQL, PostgreSQL, Oracle: see Note 4
  • Sqoop 2 - MariaDB 5.5, 10 (see Note 9); Derby (Default)

 

 

Note:

  1. Cloudera supports the databases listed above provided they are supported by the underlying operating system on which they run.
  2. MySQL 5.5 is supported on CDH 5.1. MySQL 5.6 is supported on CDH 5.1 and higher. The InnoDB storage engine must be enabled in the MySQL server.
  3. Cloudera Manager installation fails if GTID-based replication is enabled in MySQL.
  4. PostgreSQL 9.2 is supported on CDH 5.1 and higher. PostgreSQL 9.3 is supported on CDH 5.2 and higher. PostgreSQL 9.4 is supported on CDH 5.5 and higher.
  5. For purposes of transferring data only, Sqoop 1 supports MySQL 5.0 and above, PostgreSQL 8.4 and above, Oracle 10.2 and above, Teradata 13.10 and above, and Netezza TwinFin 5.0 and above. The Sqoop metastore works only with HSQLDB (1.8.0 and higher 1.x versions; the metastore does not work with any HSQLDB 2.x versions).
  6. Derby is supported as shown in the table, but not always recommended. See the pages for individual components in the Cloudera Installation guide for recommendations.
  7. CDH 5 Hue requires the default MySQL version of the operating system on which it is being installed, which is usually MySQL 5.1, 5.5, or 5.6.
  8. When installing a JDBC driver, only the ojdbc6.jar file is supported for both Oracle 11g R2 and Oracle 12c; the ojdbc7.jar file is not supported.
  9. Sqoop 2 lacks some of the features of Sqoop 1. Cloudera recommends you use Sqoop 1. Use Sqoop 2 only if it contains all the features required for your use case.
  10. MariaDB 10 is supported only on CDH 5.9 and higher.

The following versions of CDH and managed services are supported:

 

Warning: Cloudera Manager 5 does not support CDH 3 and you cannot upgrade Cloudera Manager 4 to Cloudera Manager 5 if you have a cluster running CDH 3. Therefore, to upgrade CDH 3 clusters to CDH 4 using Cloudera Manager, you must use Cloudera Manager 4.

  • CDH 4 and CDH 5. The latest released versions of CDH 4 and CDH 5 are strongly recommended. For information on CDH 4 requirements, see CDH 4 Requirements and Supported Versions. For information on CDH 5 requirements, see CDH 5 Requirements and Supported Versions.
  • Cloudera Impala - Cloudera Impala is included with CDH 5. With CDH 4, Cloudera Impala 1.2.1 is supported on CDH 4.1.0 or higher. For more information on Impala requirements with CDH 4, see Impala Requirements.
  • Cloudera Search - Cloudera Search is included with CDH 5. With CDH 4, Cloudera Search 1.2.0 is supported on CDH 4.6.0. For more information on Cloudera Search requirements with CDH 4, see Cloudera Search Requirements.
  • Apache Spark - 0.90 or higher with CDH 4.4.0 or higher.
  • Apache Accumulo - 1.4.3 with CDH 4.3.0, 1.4.4 with CDH 4.5.0, and 1.6.0 with CDH 4.6.0.

For more information, see the Product Compatibility Matrix.


See CDH and Cloudera Manager Supported Transport Layer Security Versions.

 

To configure TLS security for the Cloudera Manager Server and Agents, see Configuring TLS Security for Cloudera Manager.


Cloudera Manager requires the following resources:

  • Disk Space
    • Cloudera Manager Server
      • 5 GB on the partition hosting /var.
      • 500 MB on the partition hosting /usr.
      • For parcels, the space required depends on the number of parcels you download to the Cloudera Manager Server and distribute to Agent hosts. You can download multiple parcels of the same product, of different versions and different builds. If you are managing multiple clusters, only one parcel of a product/version/build/distribution is downloaded on the Cloudera Manager Server—not one per cluster. In the local parcel repository on the Cloudera Manager Server, the approximate sizes of the various parcels are as follows:
        • CDH 5 (which includes Impala and Search) - 1.5 GB per parcel (packed), 2 GB per parcel (unpacked)
        • Impala - 200 MB per parcel
        • Cloudera Search - 400 MB per parcel
    • Cloudera Management Service - The Host Monitor and Service Monitor databases are stored on the partition hosting /var. Ensure that you have at least 20 GB available on this partition.
    • Agents - On Agent hosts, each unpacked parcel requires about three times the space of the downloaded parcel on the Cloudera Manager Server. By default, unpacked parcels are located in /opt/cloudera/parcels.
  • RAM - 4 GB is recommended for most cases and is required when using Oracle databases. 2 GB might be sufficient for non-Oracle deployments with fewer than 100 hosts. However, to run the Cloudera Manager Server on a machine with 2 GB of RAM, you must tune down its maximum heap size (by modifying -Xmx in /etc/default/cloudera-scm-server). Otherwise the kernel might kill the Server for consuming too much RAM.
  • Python - Cloudera Manager requires Python 2.4 or higher and is compatible through the latest version of Python 2.x; Python 3.0 and higher are not supported. Hue in CDH 5 and package installs of CDH 5 require Python 2.6 or 2.7. All supported operating systems include Python 2.4 or higher.
  • Perl - Cloudera Manager requires Perl.
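The disk figures above lend themselves to a quick pre-flight check. The sketch below is not an official tool; the thresholds follow the recommendations above, and the heap-tuning variable shown in the comment is an assumption that may vary by release:

```shell
# check_free_mb MOUNTPOINT NEED_MB: report whether the partition backing a
# mount point has at least the given free space in MB (df -P gives portable,
# one-line-per-filesystem output; -m reports in MB).
check_free_mb() {
  mountpoint="$1"
  need_mb="$2"
  avail_mb=$(df -Pm "$mountpoint" | awk 'NR==2 {print $4}')
  if [ "$avail_mb" -ge "$need_mb" ]; then
    echo "$mountpoint OK: ${avail_mb} MB free (need ${need_mb} MB)"
  else
    echo "$mountpoint LOW: ${avail_mb} MB free (need ${need_mb} MB)"
    return 1
  fi
}

# Per the figures above: 5 GB on the partition hosting /var, 500 MB on /usr.
check_free_mb /var 5120 || echo "consider freeing space on /var"
check_free_mb /usr 500  || echo "consider freeing space on /usr"

# To run the Cloudera Manager Server with only 2 GB of RAM, lower its maximum
# heap in /etc/default/cloudera-scm-server, e.g. (variable name may vary):
#   export CMF_JAVA_OPTS="-Xmx1G"
```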

The hosts in a Cloudera Manager deployment must satisfy the following networking and security requirements:

  • CDH requires IPv4. IPv6 is not supported and must be disabled.

  • Multihoming CDH or Cloudera Manager is not supported outside specifically certified Cloudera partner appliances. Cloudera finds that current Hadoop architectures combined with modern network infrastructures and security practices remove the need for multihoming. Multihoming, however, is beneficial internally in appliance form factors to take advantage of high-bandwidth InfiniBand interconnects.
  • Although some subareas of the product might work with unsupported custom multihoming configurations, there are known issues with multihoming. In addition, unknown issues can arise because multihoming is not covered by the test matrix outside the Cloudera-certified partner appliances.
  • Cluster hosts must have a working network name resolution system and a correctly formatted /etc/hosts file. All cluster hosts must have properly configured forward and reverse host resolution through DNS. The /etc/hosts files must:
    • Contain consistent information about hostnames and IP addresses across all hosts
    • Not contain uppercase hostnames
    • Not contain duplicate IP addresses

    Cluster hosts must not use aliases, either in /etc/hosts or in configuring DNS. A properly formatted /etc/hosts file should be similar to the following example:

    127.0.0.1 localhost.localdomain localhost
    192.168.1.1 cluster-01.example.com cluster-01
    192.168.1.2 cluster-02.example.com cluster-02
    192.168.1.3 cluster-03.example.com cluster-03
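The /etc/hosts rules above can be checked mechanically. Below is a small sketch (not a Cloudera tool; the function name is made up) that flags duplicate IP addresses and uppercase hostnames in a hosts file:

```shell
# check_hosts FILE: flag duplicate IP addresses and uppercase hostnames, per
# the /etc/hosts rules above. Prints OK and returns 0 if the file passes.
check_hosts() {
  file="$1"
  # Strip comments and blank lines once.
  body=$(grep -v '^[[:space:]]*#' "$file" | awk 'NF')
  # Duplicate IP addresses (first column).
  dups=$(printf '%s\n' "$body" | awk '{print $1}' | sort | uniq -d)
  if [ -n "$dups" ]; then
    echo "duplicate IP addresses: $dups"
    return 1
  fi
  # Uppercase hostnames (everything after the IP column).
  if printf '%s\n' "$body" | awk '{$1=""; print}' | grep -q '[A-Z]'; then
    echo "uppercase hostnames found"
    return 1
  fi
  echo "OK"
}

# Usage: check_hosts /etc/hosts
```

Forward and reverse DNS resolution itself is environment-specific and can be spot-checked separately with getent hosts or the host command.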

  • In most cases, the Cloudera Manager Server must have SSH access to the cluster hosts when you run the installation or upgrade wizard. You must log in using a root account or an account that has password-less sudo permission. For authentication during the installation and upgrade procedures, you must either enter the password or upload a public and private key pair for the root or sudo user account. If you want to use a public and private key pair, the public key must be installed on the cluster hosts before you use Cloudera Manager.

    Cloudera Manager uses SSH only during the initial install or upgrade. Once the cluster is set up, you can disable root SSH access or change the root password. Cloudera Manager does not save SSH credentials, and all credential information is discarded when the installation is complete.

  • If single user mode is not enabled, the Cloudera Manager Agent runs as root so that it can make sure the required directories are created and that processes and files are owned by the appropriate user (for example, the hdfs and mapred users).
  • No blocking is done by Security-Enhanced Linux (SELinux). Note: Cloudera Enterprise is supported on platforms with Security-Enhanced Linux (SELinux) enabled. However, Cloudera does not support use of SELinux with Cloudera Navigator. Cloudera is not responsible for policy support or policy enforcement. If you experience issues with SELinux, contact your OS provider.
  • No blocking by iptables or firewalls; port 7180 must be open because it is used to access Cloudera Manager after installation. Cloudera Manager communicates using specific ports, which must be open.
  • For RHEL and CentOS, the /etc/sysconfig/network file on each host must contain the hostname you have just set (or verified) for that host.
  • Cloudera Manager and CDH use several user accounts and groups to complete their tasks. The set of user accounts and groups varies according to the components you choose to install. Do not delete these accounts or groups and do not modify their permissions and rights. Ensure that no existing systems prevent these accounts and groups from functioning. For example, if you have scripts that delete user accounts not in a whitelist, add these accounts to the list of permitted accounts. Cloudera Manager, CDH, and managed services create and use the following accounts and groups:

Users and Groups

Each entry lists: Component (Version) - Unix User ID - Groups - Notes
Cloudera Manager (all versions) cloudera-scm cloudera-scm Cloudera Manager processes such as the Cloudera Manager Server and the monitoring roles run as this user. The Cloudera Manager keytab file must be named cmf.keytab since that name is hard-coded in Cloudera Manager. Note: Applicable to clusters managed by Cloudera Manager only.

Apache Accumulo (Accumulo 1.4.3 and higher) accumulo accumulo Accumulo processes run as this user.
Apache Avro   No special users.
Apache Flume (CDH 4, CDH 5) flume flume The sink that writes to HDFS as this user must have write privileges.
Apache HBase (CDH 4, CDH 5) hbase hbase The Master and the RegionServer processes run as this user.
HDFS (CDH 4, CDH 5) hdfs hdfs, hadoop The NameNode and DataNodes run as this user, and the HDFS root directory as well as the directories used for edit logs should be owned by it.
Apache Hive (CDH 4, CDH 5) hive hive The HiveServer2 process and the Hive Metastore processes run as this user. A user must be defined for Hive access to its Metastore DB (for example, MySQL or Postgres), but it can be any identifier and does not correspond to a Unix uid. This is javax.jdo.option.ConnectionUserName in hive-site.xml.

Apache HCatalog (CDH 4.2 and higher, CDH 5) hive hive The WebHCat service (for REST access to Hive functionality) runs as the hive user.

HttpFS (CDH 4, CDH 5) httpfs httpfs The HttpFS service runs as this user. See HttpFS Security Configuration for instructions on how to generate the merged httpfs-http.keytab file.

Hue (CDH 4, CDH 5) hue hue Hue services run as this user.

Hue Load Balancer (Cloudera Manager 5.5 and higher) apache apache The Hue Load balancer has a dependency on the apache2 package that uses the apache user name. Cloudera Manager does not run processes using this user ID.
Cloudera Impala (CDH 4.1 and higher, CDH 5) impala impala, hive Impala services run as this user.
Apache Kafka (Cloudera Distribution of Kafka 1.2.0) kafka kafka Kafka services run as this user.
Java KeyStore KMS (CDH 5.2.1 and higher) kms kms The Java KeyStore KMS service runs as this user.
Key Trustee KMS (CDH 5.3 and higher) kms kms The Key Trustee KMS service runs as this user.
Key Trustee Server (CDH 5.4 and higher) keytrustee keytrustee The Key Trustee Server service runs as this user.
Kudu kudu kudu Kudu services run as this user.
Llama (CDH 5) llama llama Llama runs as this user.
Apache Mahout   No special users.
MapReduce (CDH 4, CDH 5) mapred mapred, hadoop Without Kerberos, the JobTracker and tasks run as this user. The LinuxTaskController binary is owned by this user for Kerberos.
Apache Oozie (CDH 4, CDH 5) oozie oozie The Oozie service runs as this user.
Parquet   No special users.
Apache Pig   No special users.
Cloudera Search (CDH 4.3 and higher, CDH 5) solr solr The Solr processes run as this user.
Apache Spark (CDH 5) spark spark The Spark History Server process runs as this user.
Apache Sentry (CDH 5.1 and higher) sentry sentry The Sentry service runs as this user.
Apache Sqoop (CDH 4, CDH 5) sqoop sqoop This user is only for the Sqoop1 Metastore, a configuration option that is not recommended.
Apache Sqoop2 (CDH 4.2 and higher, CDH 5) sqoop2 sqoop, sqoop2 The Sqoop2 service runs as this user.
Apache Whirr   No special users.
YARN (CDH 4, CDH 5) yarn yarn, hadoop Without Kerberos, all YARN services and applications run as this user. The LinuxContainerExecutor binary is owned by this user for Kerberos.
Apache ZooKeeper (CDH 4, CDH 5) zookeeper zookeeper The ZooKeeper processes run as this user. It is not configurable.
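Whether these accounts actually exist on a host can be spot-checked with getent. The sketch below covers only a subset of the users from the table and is an illustrative check, not a Cloudera tool; extend the list to match the components you install:

```shell
# Spot-check that service accounts from the table above exist on this host.
# The user list is a subset; extend it per the components you install.
for u in cloudera-scm hdfs mapred yarn zookeeper; do
  if getent passwd "$u" >/dev/null; then
    echo "$u: present"
  else
    echo "$u: MISSING (expected if the component is not installed yet)"
  fi
done
```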


What's New in Cloudera Manager 5.9.0

 

  • Creating Virtual Machine Images

    Documentation has been added with procedures to create virtual images of Cloudera Manager and cluster hosts. See Creating Virtual Images of Cluster Hosts.

  • Security
    • External/Cloud account configuration in Cloudera Manager

      Account configuration for access to Amazon Web Services is now available through the centralized UI menu External Accounts.

    • Key Trustee Server rolling restart

      Key Trustee Server now supports rolling restart.

  • Backup and Disaster Recovery
    • You can now replicate HDFS files and Hive data to and from an Amazon S3 instance. See HDFS Replication to Amazon S3 and Hive Replication To and From Amazon S3.
    • There are some new tuning options to improve performance of HDFS replication. See HDFS Replication Tuning.
    • You can now download performance data about HDFS replication jobs from the Replication Schedules and Replication History pages. See Monitoring the Performance of HDFS Replications.
    • Hive replication now stores Hive UDFs in the Hive metastore. See Replication of Impala and Hive User Defined Functions (UDFs).
    • The user interface for creating replication schedules has been reorganized to present the configuration options on three tabs: General, Resources, and Advanced.
    • Replicate Impala Metadata unchecked by default

      When creating a Hive replication schedule, the option Replicate Impala Metadata was checked (true) by default. In Cloudera Manager 5.9 and higher, the value is unchecked (false) by default.

    • YARN BDR enhancement

      YARN jobs now include the BDR schedule ID that launched the job so you can connect logs with existing schedules, if multiple schedules exist.

  • Resource Management
    • Custom Cluster Utilization Reports

      Documentation has been added to create custom Cluster Utilization reports that you can export data from. See Creating a Custom Cluster Utilization Report.

    • New settings for continuous scheduling

      For new installations, default configuration values have changed: yarn_scheduler_fair_continuous_scheduling_enabled is set to false, and resourcemanager_fair_scheduler_assign_multiple is set to true. Existing settings are preserved when you upgrade from a lower version.

    • YARN historical reports by user show pool-user entity

      When Cloudera Manager manages multiple clusters, there is no per user tracking for historical applications and queries across clusters. Instead, Historical Applications by User and Historical Queries by User show applications and queries per user and pool. (A pool is associated with a specific cluster.)

    • Directory Usage Report export capability

      Directory usage reports can now be exported as a CSV file.

  • Cloudera Manager Admin Console User Interface
    • Service colors

      A new set of colors is used to represent each kind of service.

    • Move the table sorting icon to the right

      The table sorting icon now appears consistently on the right hand side of each column.

    • Improved Configuration Diff Display

      Changes displayed in the configuration history page are much more user friendly. For a large section of changed text, Cloudera Manager generates a diff between the old and the new and displays the diff.

      When a user changes only the password, Cloudera Manager does not show the delta: both the old and the new passwords are masked out before the comparison is performed.

    • Move actions menu to the top header

      The actions menu now appears next to the entity title.

    • Move Federation and High Availability to a separate page

      The Federation and High Availability sections used to appear on the HDFS Instances page of an HDFS service. They have been moved to a new page called Federation and High Availability. There is a link from the existing Instances page to this new page.

    • Remove repeated heading below the second level navigation

      Subtitles below the second level navigation tabs are removed because they repeated the content in the tabs.

    • Move maintenance mode and badges to the title area

      Maintenance mode and staleness badges now appear next to the title of the entity.

    • Express wizard allows you to add Kafka

      Kafka is now listed in the custom services when you click the Add Cluster button.

  • Cloudera Manager API
    • Add update_user to Python API client

      Added the update_user() method to the Python API client api_client.py.

    • Expose API endpoint to add a specific path

      New API endpoints have been added that allow users to add, list and remove Watched Directories in HDFS service.

  • Logging
    • Include host in log file name

      Kafka log4j log files now include the host name in the format kafka-broker-${host}.log. Similarly, MirrorMaker logs now include the host name in the format kafka-mirrormaker-${host}.log. Due to the log file name change, when you upgrade Cloudera Manager it no longer recognizes your old log files in log search, though they are still present on disk.

    • Configuration changes to Cloudera Manager audit log

      Cloudera Manager now provides history and rollback support for Cloudera Manager settings (Administration > Settings). This helps you track changes made by an administrator so that Cloudera Support can provide better service when certain Cloudera Manager administrative settings are modified.

  • Diagnostic Bundles
    • Show the Diagnostic Bundle Redaction Policy using the redaction config

      You can specify what information should be redacted in the diagnostic bundle in the UI using Administration > Settings > Redaction Parameters for Diagnostic Bundles.

  • Upgrade
    • Report that a simple restart was performed if rolling restart could not be performed

      Cloudera Manager now informs you when a simple restart is performed instead of a rolling restart because rolling restart is not available for a service.

  • Oozie
    • Provide dump / load functionality for Oozie DB

      The Actions menu in the Oozie service has two new commands, Dump Database and Load Database. These commands make it easier to migrate an Oozie database to another database supported by Oozie. The Dump Database command exports Oozie's database to a file (configurable by Database Dump File setting). Load Database loads the file into a database.

    • Install Oozie ShareLib permissions change

      The Install Oozie ShareLib command now assigns correct permissions to the uploaded libraries. This prevents breaking Oozie workflows with a custom umask setting.

  • Configuration Changes
    • Solr zkClientTimeout option

      Added the zkClientTimeout parameter for ZooKeeper.

    • Add JHIST compression as a configuration option

      Added a new option for setting the file format used by an ApplicationMaster when generating the .jhist file.

    • Enable heap dump by default for all daemons

      Starting in version 5.9, when you configure roles that are JVM based, the Dump Heap When Out of Memory configuration parameter defaults to true. An upgrade from a pre-5.9 version maintains your pre-5.9 settings.

    • Cloudera Manager support for client-side YARN graceful decommissioning

      Adds the ability to perform a graceful decommission on YARN NodeManager roles whereby the Node Manager is not assigned new containers, and waits for any currently running applications to finish before being decommissioned unless a timeout occurs. You can configure the timeout using the Node Manager Graceful Decommission Timeout configuration property in the YARN Service. The default behavior has not changed, and continues to be a non-graceful decommission. Affects Cloudera Manager 5.9.0 and higher, and CDH 5.9.0 and higher.

    • Deploy Client Configuration command details page now shows stdout/stderr

      stdout and stderr log links are now shown in the UI when there is a failure while deploying client configurations.

    • Make EXTRA_RATIO configurable for Headlamp indexing

      Added the configuration parameter, Extra Space Ratio for Indexing, to Reports Manager. You can use the parameter to speed up indexing by allocating additional memory.

    • Configure HBase Indexer to wait longer for ZooKeeper to come up

      The default amount of time that HBase Indexer roles attempt to connect to ZooKeeper has been increased from 30 to 60 seconds. This default can be adjusted by setting a new Cloudera Manager configuration parameter, HBase Indexer ZooKeeper Session Timeout.

  • Embedded database mode improvements

    In version 5.9 and higher, Cloudera Manager can clearly identify whether or not a customer is using the embedded PostgreSQL database. Cloudera does not recommend the embedded database for production use, and requests that customers deploy production systems using an external database. The diagnostic bundles now contain information about whether or not a customer is using the embedded PostgreSQL database. Support can then reach out to customers accordingly.

    If Cloudera Manager is configured to use the embedded PostgreSQL database, a yellow banner appears in the UI recommending that you upgrade to a supported external database.

  • Fix CatalogServiceClient to handle TLS connections to catalogd for UDF replication

    When Impala uses SSL, TLS connections to the Catalog Server are now supported. Customers can enable replication for any Impala UDFs/metadata (in Hive replication) in Cloudera Manager 5.9 and higher.

  • Do not show steps that are unreachable (skipped)

    When running wizards from the Cloudera Manager Admin Console that add a cluster, add a service, perform an upgrade, and other tasks, steps do not display when they are not reachable or do not apply to the current configuration.

  • Improve Cloudera Manager provisioning performance on AWS

    Adds support for resetting the Cloudera Manager GUID/UUID, which is accomplished by checking a UUID file.

    If Cloudera Manager finds the UUID file (/etc/cloudera-scm-server/uuid) and the UUID is different than the GUID in the cm_version table, it updates the GUID in the cm_version table with the contents of the UUID file and removes the UUID file.

  • DSSD
    • Trigger HDFS rolling upgrade command for 5.8 to 5.9 in DSSD mode

      DSSD has implemented a data format change for the DSSD-DN in DHP 1.3 (equivalent of Cloudera Manager 5.9). DSSD relies on the HDFS rolling upgrade mechanism to automatically convert from old data format prior to DHP 1.3 to the new data format. This process is very similar to the HDFS metadata (rolling) upgrade process. You can roll back to the old data format before finalizing the upgrade. After the cluster is upgraded to DHP 1.3 and the data format conversion is complete, you can finalize the upgrade, and the old data format is cleaned up. Cloudera Manager issues the HDFS rolling upgrade commands automatically when upgrading a DSSD cluster from 5.8 to 5.9.

      In Cloudera Manager 5.8, the DSSD DataNodes operating in the Hadoop cluster store version and ID information in a local disk directory. The default directory is /tmp/hadoop-hdfs. This default directory may be deleted by the OS over time or when the OS is restarted. If the DSSD DataNode process is restarted after the default directory is deleted, it will not be able to locate replicas stored on the DSSD D5 appliance. This causes the HDFS service to report under-replicated or even missing HDFS blocks.

      It is a best practice to change the default directory to a different value such as /var/lib/hadoop-hdfs/dssddn. The directory should be created and configured prior to initiating the installation of the Hadoop cluster:

      1. Log into each host that will run the DataNode process.
      2. Create the directory by executing the command:

        mkdir -p /var/lib/hadoop-hdfs/dssddn

      3. Change the directory ownership by executing the command:

        chown -R hdfs:hadoop /var/lib/hadoop-hdfs

      4. Change the directory permissions by executing the command:

        chmod -R 700 /var/lib/hadoop-hdfs

      5. In Cloudera Manager, configure the property dfs.datanode.data.dir in the HDFS Service Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml section of the configuration page. This property must be set after the initial setup is completed, but prior to writing any data to HDFS.
      Note: Changing the value of the dfs.datanode.data.dir property after data has been written to HDFS will result in under-replicated or lost HDFS blocks.

       

    • Collect additional DSSD 1.2 metrics

      Cloudera Manager collects the following new metrics from DSSD-DN. Note that these metrics are only available in DHP 1.3 and higher.

      DSSD-DN Metrics

      Metric Description
      hdfs_dssd_usable_capacity This reports a single-datanode-level view of the usable capacity on the D5 to which that particular DSSD-DN is connected.
      hdfs_dssd_used_capacity This reports a single-datanode-level view of used capacity on the D5 to which that particular DSSD-DN is connected.
      hdfs_dssd_object_max_number This reports a single-datanode-level view of the maximum number of blocks that can be created on the D5 to which that particular DSSD-DN is connected.
      hdfs_dssd_object_used_number This reports a single-datanode-level view of the number of blocks created on the D5 to which that particular DSSD-DN is connected.

       

    • Remove "Usable Capacity" in cluster setup wizard in DSSD mode

      The DSSD-specific configuration com.dssd.hadoop.floodds.usablecapacity is no longer required by DHP 1.3 and is not emitted by Cloudera Manager for CDH 5.9 and higher. The Usable Capacity configuration no longer exists in the HDFS service and setup wizard in Cloudera Manager for CDH 5.9 and higher.

    • Libflood CPU ID validation

      Cloudera Manager validates the Flood CPU ID field and only allows comma-separated integers or the string "all".

