Cloudera Manager is the recommended tool for installing Cloudera Enterprise.
This download installs Cloudera Enterprise or Cloudera Express.
Cloudera Enterprise requires a license; however, when installing Cloudera Express you will have the option to unlock Cloudera Enterprise features for a free 60-day trial.
Once the trial has concluded, the Cloudera Enterprise features will be disabled until you obtain and upload a license.
- System Requirements
- What's New
- Documentation
System Requirements
- Supported Operating Systems
- Supported JDK Versions
- Supported Browsers
- Supported Databases
- Supported CDH and Managed Service Versions
- Supported Transport Layer Security Versions
- Resource Requirements
- Networking and Security Requirements
Supported Operating Systems
Note: Clusters with mixed operating system types and versions are supported; however, using the same version of the same operating system on all cluster hosts is strongly recommended.
Cloudera Manager supports the following 64-bit operating systems:
- RHEL-compatible
- Red Hat Enterprise Linux and CentOS, 64-bit (+ SELinux mode in available versions)
- 5.7
- 5.10
- 6.4
- 6.5
- 6.6
- 6.7
- 7.1
- 7.2
- Oracle Enterprise Linux with default kernel and Unbreakable Enterprise Kernel, 64-bit
- 5.7 (UEK R2)
- 5.10
- 5.11
- 6.4 (UEK R2)
- 6.5 (UEK R2, UEK R3)
- 6.6 (UEK R3)
- 6.7 (UEK R3)
- 7.1
- 7.2
Important: Cloudera supports RHEL 7 with the following limitations:
- Only RHEL 7.2 and 7.1 are supported. RHEL 7.0 is not supported.
- Only new installations of RHEL 7.2 and 7.1 are supported by Cloudera. For upgrades to RHEL 7.1 or 7.2, contact your OS vendor and see Does Red Hat support upgrades between major versions of Red Hat Enterprise Linux?
- SLES - SUSE Linux Enterprise Server 11, Service Pack 4, 64-bit is supported by CDH 5.7 and higher. Service Packs 2 and 3 are supported by CDH 5.0 through CDH 5.6. Service Pack 1 is not supported by CDH 5, only by CDH 4. Hosts running Cloudera Manager Agents must use SUSE Linux Enterprise Software Development Kit 11 SP1.
- Debian - Wheezy 7.0, 7.1, and 7.8, 64-bit. (Squeeze 6.0 is only supported by CDH 4.)
- Ubuntu - Trusty 14.04 (LTS) and Precise 12.04 (LTS), 64-bit. (Lucid 10.04 is only supported by CDH 4.)
Note:
- Cloudera Enterprise is supported on platforms with Security-Enhanced Linux (SELinux) enabled. Cloudera is not responsible for policy support nor policy enforcement. If you experience issues with SELinux, contact your OS provider.
Supported JDK Versions
The version of Oracle JDK supported by Cloudera Manager depends on the version of CDH being managed. The following table lists the JDK versions supported on a Cloudera Manager 5.7 cluster running the latest CDH 4 and CDH 5. For more information on supported JDK versions for previous versions of Cloudera Manager and CDH, see Compatibility.
Important: There is one exception to the minimum supported and recommended JDK versions listed below. If Oracle releases a security patch that affects server-side Java before the next minor release of Cloudera products, the Cloudera support policy covers customers using the patch.
CDH Version Managed (Latest) | Minimum Supported JDK Version | Recommended JDK Version |
---|---|---|
CDH 5 | 1.7.0_55 | 1.7.0_67, 1.7.0_75, 1.7.0_80 |
CDH 5 | 1.8.0_31 (Cloudera recommends that you not use JDK 1.8.0_40.) | 1.8.0_60 |
CDH 4 and CDH 5 | 1.7.0_55 | 1.7.0_67, 1.7.0_75, 1.7.0_80 |
CDH 4 and CDH 5 | 1.8.0_31 | 1.8.0_60 |
CDH 4 | 1.6.0_31 | 1.7.0_80 |
Cloudera Manager can install Oracle JDK 1.7.0_67 during installation and upgrade. If you prefer to install the JDK yourself, follow the instructions in Java Development Kit Installation.
Supported Browsers
The Cloudera Manager Admin Console, which you use to install, configure, manage, and monitor services, supports the following browsers:
- Mozilla Firefox 24 and 31.
- Google Chrome 36 and higher.
- Internet Explorer 9 and higher (Internet Explorer 11 is supported in Native Mode).
- Safari 5 and higher.
Supported Databases
Cloudera Manager requires several databases. The Cloudera Manager Server stores information about configured services, role assignments, configuration history, commands, users, and running processes in a database of its own. You must also specify a database for the Activity Monitor and Reports Manager roles.
Important: When processes restart, the configuration for each of the services is redeployed using information that is saved in the Cloudera Manager database. If this information is not available, your cluster will not start or function correctly. You must therefore schedule and maintain regular backups of the Cloudera Manager database in order to recover the cluster in the event of the loss of this database.
The database you use must be configured to support UTF8 character set encoding. The embedded PostgreSQL database that is installed when you follow Installation Path A - Automated Installation by Cloudera Manager automatically provides UTF8 encoding. If you install a custom database, you may need to enable UTF8 encoding. The commands for enabling UTF8 encoding are described in each database topic under Cloudera Manager and Managed Service Data Stores.
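The exact commands vary by database and are covered in the data-store topics; as an illustrative sketch only (the database name "scm" is a placeholder), UTF8 encoding can be requested at creation time:

```shell
# Illustrative only: create a UTF8-encoded database for Cloudera Manager.
# The database name "scm" is a placeholder, not a documented value.

# PostgreSQL: request UTF8 encoding explicitly at creation time.
sudo -u postgres createdb --encoding=UTF8 scm

# MySQL / MariaDB: set the default character set when creating the database.
mysql -u root -p -e "CREATE DATABASE scm DEFAULT CHARACTER SET utf8;"
```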
After installing a database, upgrade to the latest patch version and apply any other appropriate updates. Available updates may be specific to the operating system on which it is installed.
Cloudera Manager and its supporting services can use the following databases:
- MariaDB 5.5
- MySQL - 5.1, 5.5, 5.6, and 5.7
- Oracle 11gR2 and 12c
Note: When installing a JDBC driver, only the ojdbc6.jar file is supported for both Oracle 11g R2 and Oracle 12c; the ojdbc7.jar file is not supported.
- PostgreSQL - 8.1, 8.3, 8.4, 9.1, 9.2, 9.3, 9.4
Cloudera supports the shipped version of MariaDB, MySQL and PostgreSQL for each supported Linux distribution. Each database is supported for all components in Cloudera Manager and CDH subject to the notes in CDH 4 Supported Databases and CDH 5 Supported Databases.
Supported CDH and Managed Service Versions
The following versions of CDH and managed services are supported:
Warning: Cloudera Manager 5 does not support CDH 3 and you cannot upgrade Cloudera Manager 4 to Cloudera Manager 5 if you have a cluster running CDH 3. Therefore, to upgrade CDH 3 clusters to CDH 4 using Cloudera Manager, you must use Cloudera Manager 4.
- CDH 4 and CDH 5. The latest released versions of CDH 4 and CDH 5 are strongly recommended. For information on CDH 4 requirements, see CDH 4 Requirements and Supported Versions. For information on CDH 5 requirements, see CDH 5 Requirements and Supported Versions.
- Cloudera Impala - Cloudera Impala is included with CDH 5. Cloudera Impala 1.2.1 with CDH 4.1.0 or later. For more information on Impala requirements with CDH 4, see Impala Requirements.
- Cloudera Search - Cloudera Search is included with CDH 5. Cloudera Search 1.2.0 with CDH 4.6.0. For more information on Cloudera Search requirements with CDH 4, see Cloudera Search Requirements.
- Apache Spark - 0.9.0 or later with CDH 4.4.0 or later.
- Apache Accumulo - 1.4.3 with CDH 4.3.0, 1.4.4 with CDH 4.5.0, and 1.6.0 with CDH 4.6.0.
For more information, see the Product Compatibility Matrix.
Supported Transport Layer Security Versions
The following components are supported by Transport Layer Security (TLS):
Table 1. Components Supported by TLS
Component | Role | Port | Version |
---|---|---|---|
Cloudera Manager | Cloudera Manager Server | 7182 | TLS 1.2 |
Cloudera Manager | Cloudera Manager Server | 7183 | TLS 1.2 |
Flume | | 9099 | TLS 1.2 |
HBase | Master | 60010 | TLS 1.2 |
HDFS | NameNode | 50470 | TLS 1.2 |
HDFS | Secondary NameNode | 50495 | TLS 1.2 |
Hive | HiveServer2 | 10000 | TLS 1.2 |
Hue | Hue Server | 8888 | TLS 1.2 |
Cloudera Impala | Impala Daemon | 21000 | TLS 1.2 |
Cloudera Impala | Impala Daemon | 21050 | TLS 1.2 |
Cloudera Impala | Impala Daemon | 22000 | TLS 1.2 |
Cloudera Impala | Impala Daemon | 25000 | TLS 1.2 |
Cloudera Impala | Impala StateStore | 24000 | TLS 1.2 |
Cloudera Impala | Impala StateStore | 25010 | TLS 1.2 |
Cloudera Impala | Impala Catalog Server | 25020 | TLS 1.2 |
Cloudera Impala | Impala Catalog Server | 26000 | TLS 1.2 |
Oozie | Oozie Server | 11443 | TLS 1.1 |
Solr | Solr Server | 8983 | TLS 1.1 |
Solr | Solr Server | 8985 | TLS 1.1 |
YARN | ResourceManager | 8090 | TLS 1.2 |
YARN | JobHistory Server | 19890 | TLS 1.2 |
To configure TLS security for the Cloudera Manager Server and Agents, see Configuring TLS Security for Cloudera Manager.
Resource Requirements
Cloudera Manager requires the following resources:
- Disk Space
- Cloudera Manager Server
- 5 GB on the partition hosting /var.
- 500 MB on the partition hosting /usr.
- For parcels, the space required depends on the number of parcels you download to the Cloudera Manager Server and distribute to Agent hosts. You can download multiple parcels of the same product, of different versions and builds. If you are managing multiple clusters, only one parcel of a product/version/build/distribution is downloaded on the Cloudera Manager Server—not one per cluster. In the local parcel repository on the Cloudera Manager Server, the approximate sizes of the various parcels are as follows:
- CDH 4.6 - 700 MB per parcel
- CDH 5 (which includes Impala and Search) - 1.5 GB per parcel (packed), 2 GB per parcel (unpacked)
- Cloudera Impala - 200 MB per parcel
- Cloudera Search - 400 MB per parcel
- Cloudera Management Service - The Host Monitor and Service Monitor databases are stored on the partition hosting /var. Ensure that you have at least 20 GB available on this partition. For more information, see Data Storage for Monitoring Data.
- Agents - On Agent hosts each unpacked parcel requires about three times the space of the downloaded parcel on the Cloudera Manager Server. By default unpacked parcels are located in /opt/cloudera/parcels.
- Cloudera Manager Server
- RAM - 4 GB is recommended for most cases and is required when using Oracle databases. 2 GB may be sufficient for non-Oracle deployments with fewer than 100 hosts. However, to run the Cloudera Manager Server on a machine with 2 GB of RAM, you must tune down its maximum heap size (by modifying -Xmx in /etc/default/cloudera-scm-server). Otherwise the kernel may kill the Server for consuming too much RAM.
- Python - Cloudera Manager and CDH 4 require Python 2.4 or later, but Hue in CDH 5 and package installs of CDH 5 require Python 2.6 or 2.7. All supported operating systems include Python version 2.4 or later.
- Perl - Cloudera Manager requires perl.
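The RAM note above mentions lowering the Server's maximum heap when running on a 2 GB machine; the edit might look like the following sketch. The variable name and the 1 GB value are assumptions for illustration, not documented recommendations:

```shell
# /etc/default/cloudera-scm-server (illustrative fragment)
# Lower -Xmx so the Cloudera Manager Server fits on a 2 GB host;
# the 1 GB figure below is an assumed starting point, not a documented value.
export CMF_JAVA_OPTS="-Xmx1G -XX:MaxPermSize=256m"
```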
Networking and Security Requirements
The hosts in a Cloudera Manager deployment must satisfy the following networking and security requirements:
- Cluster hosts must have a working network name resolution system and a correctly formatted /etc/hosts file. All cluster hosts must have properly configured forward and reverse host resolution through DNS. The /etc/hosts files must:
- Contain consistent information about hostnames and IP addresses across all hosts
- Not contain uppercase hostnames
- Not contain duplicate IP addresses
Also, do not use aliases, either in /etc/hosts or in configuring DNS. A properly formatted /etc/hosts file should be similar to the following example:
127.0.0.1 localhost.localdomain localhost
192.168.1.1 cluster-01.example.com cluster-01
192.168.1.2 cluster-02.example.com cluster-02
192.168.1.3 cluster-03.example.com cluster-03
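The rules above (no uppercase hostnames, no duplicate IP addresses) can be spot-checked with standard tools. This sketch writes a deliberately malformed sample file and flags both problems; the sample path is illustrative, and the same checks can be run against /etc/hosts on each cluster host:

```shell
# Create a sample hosts file containing two violations: an uppercase
# hostname and a duplicate IP address (the file path is illustrative).
cat > /tmp/hosts.sample <<'EOF'
127.0.0.1 localhost.localdomain localhost
192.168.1.1 cluster-01.example.com cluster-01
192.168.1.1 Cluster-02.example.com cluster-02
EOF
# Hostname fields (everything after the IP) must not contain uppercase letters.
awk '{line=$0; $1=""; if ($0 ~ /[A-Z]/) print "uppercase hostname: " line}' /tmp/hosts.sample
# Each IP address may appear only once.
awk '{print $1}' /tmp/hosts.sample | sort | uniq -d | sed 's/^/duplicate IP: /'
```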
- In most cases, the Cloudera Manager Server must have SSH access to the cluster hosts when you run the installation or upgrade wizard. You must log in using a root account or an account that has password-less sudo permission. For authentication during the installation and upgrade procedures, you must either enter the password or upload a public and private key pair for the root or sudo user account. If you want to use a public and private key pair, the public key must be installed on the cluster hosts before you use Cloudera Manager.
Cloudera Manager uses SSH only during the initial install or upgrade. Once the cluster is set up, you can disable root SSH access or change the root password. Cloudera Manager does not save SSH credentials, and all credential information is discarded when the installation is complete. For more information, see Permission Requirements for Package-based Installations and Upgrades of CDH.
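If you opt for key-based authentication, the key pair might be prepared along these lines before running the wizard. The key path and hostname are placeholders, not part of the documented procedure:

```shell
# Illustrative: generate a key pair on the machine you run the wizard from,
# then install the public key for root on each cluster host.
# Key file name and hostname below are placeholders.
ssh-keygen -t rsa -f ~/.ssh/cm_installer -N ''
ssh-copy-id -i ~/.ssh/cm_installer.pub root@cluster-01.example.com
```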
- If single user mode is not enabled, the Cloudera Manager Agent runs as root so that it can make sure the required directories are created and that processes and files are owned by the appropriate user (for example, the hdfs and mapred users).
- No blocking is done by Security-Enhanced Linux (SELinux). Note: Cloudera Enterprise is supported on platforms with Security-Enhanced Linux (SELinux) enabled. Cloudera is not responsible for policy support nor policy enforcement. If you experience issues with SELinux, contact your OS provider.
- IPv6 must be disabled.
- No blocking by iptables or firewalls; port 7180 must be open because it is used to access Cloudera Manager after installation. Cloudera Manager communicates using specific ports, which must be open.
- For RHEL and CentOS, the /etc/sysconfig/network file on each host must contain the hostname you have just set (or verified) for that host.
- Cloudera Manager and CDH use several user accounts and groups to complete their tasks. The set of user accounts and groups varies according to the components you choose to install. Do not delete these accounts or groups and do not modify their permissions and rights. Ensure that no existing systems prevent these accounts and groups from functioning. For example, if you have scripts that delete user accounts not in a whitelist, add these accounts to the list of permitted accounts. Cloudera Manager, CDH, and managed services create and use the following accounts and groups:
Component (Version) | Unix User ID | Groups | Notes |
---|---|---|---|
Cloudera Manager (all versions) | cloudera-scm | cloudera-scm | Cloudera Manager processes such as the Cloudera Manager Server and the monitoring roles run as this user. The Cloudera Manager keytab file must be named cmf.keytab since that name is hard-coded in Cloudera Manager. Note: Applicable to clusters managed by Cloudera Manager only. |
Apache Accumulo (Accumulo 1.4.3 and higher) | accumulo | accumulo | Accumulo processes run as this user. |
Apache Avro | No special users. | ||
Apache Flume (CDH 4, CDH 5) | flume | flume | The sink that writes to HDFS as this user must have write privileges. |
Apache HBase (CDH 4, CDH 5) | hbase | hbase | The Master and the RegionServer processes run as this user. |
HDFS (CDH 4, CDH 5) | hdfs | hdfs, hadoop | The NameNode and DataNodes run as this user, and the HDFS root directory as well as the directories used for edit logs should be owned by it. |
Apache Hive (CDH 4, CDH 5) | hive | hive | The HiveServer2 process and the Hive Metastore processes run as this user. A user must be defined for Hive access to its Metastore DB (e.g. MySQL or Postgres) but it can be any identifier and does not correspond to a Unix uid. This is javax.jdo.option.ConnectionUserName in hive-site.xml. |
Apache HCatalog (CDH 4.2 and higher, CDH 5) | hive | hive | The WebHCat service (for REST access to Hive functionality) runs as the hive user. |
HttpFS (CDH 4, CDH 5) | httpfs | httpfs | The HttpFS service runs as this user. See HttpFS Security Configuration for instructions on how to generate the merged httpfs-http.keytab file. |
Hue (CDH 4, CDH 5) | hue | hue | Hue services run as this user. |
Cloudera Impala (CDH 4.1 and higher, CDH 5) | impala | impala, hive | Impala services run as this user. |
Apache Kafka (Cloudera Distribution of Kafka 1.2.0) | kafka | kafka | Kafka services run as this user. |
Java KeyStore KMS (CDH 5.2.1 and higher) | kms | kms | The Java KeyStore KMS service runs as this user. |
Key Trustee KMS (CDH 5.3 and higher) | kms | kms | The Key Trustee KMS service runs as this user. |
Key Trustee Server (CDH 5.4 and higher) | keytrustee | keytrustee | The Key Trustee Server service runs as this user. |
Kudu | kudu | kudu | Kudu services run as this user. |
Llama (CDH 5) | llama | llama | Llama runs as this user. |
Apache Mahout | No special users. | ||
MapReduce (CDH 4, CDH 5) | mapred | mapred, hadoop | Without Kerberos, the JobTracker and tasks run as this user. The LinuxTaskController binary is owned by this user for Kerberos. |
Apache Oozie (CDH 4, CDH 5) | oozie | oozie | The Oozie service runs as this user. |
Parquet | No special users. | ||
Apache Pig | No special users. | ||
Cloudera Search (CDH 4.3 and higher, CDH 5) | solr | solr | The Solr processes run as this user. |
Apache Spark (CDH 5) | spark | spark | The Spark History Server process runs as this user. |
Apache Sentry (incubating) (CDH 5.1 and higher) | sentry | sentry | The Sentry service runs as this user. |
Apache Sqoop (CDH 4, CDH 5) | sqoop | sqoop | This user is only for the Sqoop1 Metastore, a configuration option that is not recommended. |
Apache Sqoop2 (CDH 4.2 and higher, CDH 5) | sqoop2 | sqoop, sqoop2 | The Sqoop2 service runs as this user. |
Apache Whirr | No special users. | ||
YARN (CDH 4, CDH 5) | yarn | yarn, hadoop | Without Kerberos, all YARN services and applications run as this user. The LinuxContainerExecutor binary is owned by this user for Kerberos. |
Apache ZooKeeper (CDH 4, CDH 5) | zookeeper | zookeeper | The ZooKeeper processes run as this user. It is not configurable. |
What's New
Issues Fixed in Cloudera Manager 5.7.1
Cloudera Manager HDFS usage reports do not include Inode references
In earlier versions of Cloudera Manager, HDFS usage reports did not include inode references. As a result, usage reports underreported HDFS directory sizes and the data used by users and groups in certain circumstances where HDFS snapshots were used.
Kafka MirrorMaker unable to start due to KAFKA_HOME not being set
Kafka MirrorMaker would not start when Kafka was installed using packages. This occurred because KAFKA_HOME was not set to the correct default when starting MirrorMaker. This issue affected Cloudera Manager 5.4.0 and higher with Kafka 1.4.0 and higher.
Authentication errors occur due to missing SAML metadata
In previous releases, Cloudera Manager SAML metadata was missing an alias and signature. As a result, errors sometimes occurred when logging out.
For existing Cloudera Manager installations configured to use SAML for external authentication: Upgrading to Cloudera Manager 5.5.0 and higher includes updated aliases and signatures that resolve this issue.
For installations or upgrades of Cloudera Manager 5.5.0 configured to use SAML for external authentication after upgrade or installation: If errors occur when logging out, update the metadata file in your identity provider (IdP) with the new file from CM Server/saml/metadata.
Child commands for deleting or adding a nameservice show stack trace
In an existing HDFS deployment with high availability, when you try to add or delete a nameservice and attempt to view the progress of the child commands, a stack trace is triggered if some of the child commands have not yet run. This fix eliminates the stack trace and informs you that the child commands have not yet been run.
HiveServer2 Web UI did not use SSL when Kerberos was enabled
SSL configuration for the HiveServer2 Web UI is now used regardless of whether Kerberos is in use.
Clusters menu expands to last cluster viewed
Previously, the Clusters menu expanded the first cluster by default, and as you expanded or collapsed the menu, Cloudera Manager remembered that cluster for the session. However, when you went to the services, roles, or hosts of another cluster, Cloudera Manager did not remember that cluster and showed the previously expanded cluster instead.
In release 5.7.1 and higher, Cloudera Manager remembers the last cluster viewed by a user and expands that cluster in the Clusters menu by default.
Spark Standalone works when HDFS is not available
The Spark standalone service now works without an HDFS service. As a result, Spark services require a restart after upgrading to Cloudera Manager 5.7.1.
Impala does not throw null pointer exception when memory limit is not set
If the configuration property Impala Daemon Memory Limit was not set, the Impala Admission Control page threw a null pointer exception.
HDFS rolling restart fails after CDH upgrade
In previous Cloudera Manager releases, when one of a pair of highly available NameNodes was down, it was possible for rolling restart or rolling upgrade commands to fail with an error message incorrectly describing the state of the NameNode as "Busy." The error message now correctly identifies the state of the NameNode (typically "Stopped" in this situation).
The Expand Range option did not work for some charts
The Expand range to fill all values option in Chart Builder now works for all charts.
Kafka unable to start due to misconfigured security.inter.broker.protocol when Kerberos is enabled
Kafka would not start when Kerberos is enabled and the default value of security.inter.broker.protocol was not changed. This occurred because Kafka tried to use the same port for SASL_PLAINTEXT and PLAINTEXT. By default, Cloudera Manager now infers the protocol based on the security settings.
This issue affected Cloudera Manager 5.5.2 and higher with Kafka 2.0.0 and higher.
Upgrading to Cloudera Manager 5.7.1 or higher upgrades currently configured values to INFERRED unless SSL/TLS is enabled and the values are currently either PLAINTEXT or SASL_PLAINTEXT. This does not cause any change in behavior.
Oozie JVM heap metrics not reported in Chart Builder for some services
Oozie JVM metrics are now available and display on the role page. They can also be accessed through Chart Builder using the oozie_memory_heap_used and oozie_memory_total_max metrics.
Spark CSD modified
The Spark CSD was modified to avoid conflicts with other CSDs that depend on it. This change causes the Spark service to require a restart when upgrading to Cloudera Manager 5.7.1.
Poorly formed Advanced Configuration Snippets cause null pointer exception with diagnostic bundles
Certain poorly formed Advanced Configuration Snippets could cause a Null Pointer Exception when uploading diagnostic bundles and setting up a Cloudera Manager peer.
Setting owner of a file in Isilon fails
On Isilon systems, the owner that a file is being changed to must be present on the system. In many cases the user is not present, so this command failed with an error message suggesting that the user is not part of the supergroup. This fix addresses the issue by not failing the command.
The Install Oozie ShareLib command is now visible to users with the Configurator role.
Default location for TLS Keystore for HttpFS is nonpersistent
The default location for the HttpFS TLS/SSL keystore was /var/run/hadoop-httpfs/.keystore, which could be deleted when the host reboots. Newly created clusters now have an empty default instead of one that could be deleted. When upgrading to Cloudera Manager 5.7.1 or higher, the old value is maintained, so there is no disruption on upgrade. However, Cloudera Manager warns that the path is in a dangerous location. To fix this problem, move the files to a safe path on that host, and then update the configuration in Cloudera Manager to point to the new path.
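The move might look like the following sketch; the destination directory is an illustrative choice, and the HttpFS keystore path must then be updated in Cloudera Manager to match:

```shell
# Illustrative: move the keystore out of /var/run (cleared at boot)
# to a persistent location; the destination path is a placeholder.
sudo mkdir -p /opt/cloudera/security/jks
sudo mv /var/run/hadoop-httpfs/.keystore /opt/cloudera/security/jks/httpfs.keystore
```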
Collecting diagnostic bundle displayed Java stack trace
Collecting a diagnostic bundle for a Hive replication schedule caused a Java stack trace to be shown on the page. This fix shows an error message instead of throwing a Java stack trace.
Unable to stop Cloudera Manager Agent on SLES 11
Running the restart or stop service commands failed to stop the Agent. Fixes TSB-144.
Error creating bean
Occasionally, some users encountered the message Error creating bean with name 'newServiceHandlerRegistry' in the Cloudera Manager Admin Console. This issue has been resolved.
Impala JVM heap size is configurable
The JVM heap size of the Impala Catalog Server can now be configured using the Java Heap Size of Catalog Server in Bytes property. The property defaults to 4 GB and, like all memory parameters, may require tuning.
Documentation
Want to Get Involved or Learn More?
Check out our other resources
Cloudera Community
Collaborate with your peers, industry experts, and Clouderans to make the most of your investment in Hadoop.
Cloudera Educational Services
Receive expert Hadoop training through Cloudera Educational Services, the industry's only truly dynamic Hadoop training curriculum that’s updated regularly to reflect the state of the art in big data.