This is the documentation for Cloudera Manager 4.8.5.
Documentation for other versions is available at Cloudera Documentation.

Installation Path C - Installation Using Tarballs

To avoid using system packages, and to use tarballs (and parcels) instead, follow the instructions in this section.

  Note: When installing with tarballs and parcels, some services may require additional dependencies which are not provided by Cloudera. To determine these dependencies, check the logs if a service fails to start or has errors. The logs should specify whether there are missing dependencies, which you then must install manually.

Before You Begin

Cloudera Manager and Cloudera Distribution of Hadoop (CDH) consist of a set of services. These services interact with each other and use databases to complete tasks. The parts that make up this system are very flexible, so you could deploy these services and resources in many different ways, though following Cloudera's installation and configuration guidelines greatly simplifies the process.

Considering this, Cloudera recommends that you begin by establishing a foundation of database resources that can be used as they become necessary throughout the installation process. Deploy the necessary supporting services first, and then proceed through the installation process.

Install the Oracle JDK

Install the Oracle Java Development Kit (JDK) on each of your cluster hosts where you want to run Hadoop before installing Cloudera's packages. Cloudera Manager can manage both CDH3 and CDH4 hosts, and the required JDK version varies accordingly.

Install Databases for the Cloudera Manager Services

Create and configure databases for the Cloudera Manager Activity Monitor, Service Monitor, Report Manager, and Host Monitor. Cloudera supports various database solutions including MySQL databases or Oracle databases.

Information about how these databases are set up in your environment is required to complete the CDH and Cloudera Manager configuration. The details of what is required vary among database types. Gather this information either as you complete the installations or from database administrators who have the information required. A list of what information is required for each database type is provided in each database section.

Follow the instructions at Installing and Configuring Databases to complete this task.

Database choices

Notes and Instructions

Option A: External PostgreSQL

After PostgreSQL is installed, you need to run a script to prepare a database for the Cloudera Manager Server as described in Installing and Configuring an External PostgreSQL Database.

Option B: External MySQL

You can use the same MySQL application that is used for the monitoring and reporting features, as described in Installing and Configuring a MySQL Database. After MySQL is installed, you need to run a script to prepare a database for the Cloudera Manager Server, as is described later in this topic.

Option C: External Oracle

You can use an external Oracle database for monitoring and reporting features, as described in Using an Oracle Database.

Step 1: Install the Cloudera Manager Server and Agents from Tarballs

Tarballs provide both the Cloudera Manager Server and Cloudera Manager Agents in a single file. Download tarballs from Cloudera Manager Version and Download Information. The files can be unpacked to any location of your choosing. Copy the tarballs to all machines on which you intend to install Cloudera Manager Server and Cloudera Manager Agents, and unpack them there. If necessary, create a new directory to accommodate the files you extract from the tarball. For instance, if /opt/cloudera-manager does not exist, create it using a command similar to:

$ sudo mkdir /opt/cloudera-manager

When you extract the tarball into this directory, the files are placed in a subdirectory named according to the Cloudera Manager version being extracted. For example, files could extract to /opt/cloudera-manager/cm-4.5/. This full path, which includes the Cloudera Manager version number, is needed later and is referred to as the <tarball root> directory.

When you have a location to which to extract the contents of the tarball, extract the contents. For example, after copying the tar file to your current directory, you might extract its contents to the /opt/cloudera-manager directory using a command similar to the following:

$ sudo tar xzf cloudera-manager*.tar.gz -C /opt/cloudera-manager
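
Because the extracted directory name depends on the release, it can help to discover the <tarball root> path rather than typing it by hand. The following is a minimal sketch, assuming the tarball extracts to a subdirectory whose name starts with cm- (as in the cm-4.5 example above). The function name find_tarball_root is hypothetical and is not part of Cloudera Manager.

```shell
# Print the first cm-* subdirectory found under an install directory.
# Assumption: the tarball extracts to <install dir>/cm-<version>/.
find_tarball_root() {
    install_dir="$1"
    for candidate in "$install_dir"/cm-*/; do
        # If nothing matches, the glob stays literal and this test fails.
        [ -d "$candidate" ] || continue
        printf '%s\n' "${candidate%/}"
        return 0
    done
    echo "no cm-* directory found under $install_dir" >&2
    return 1
}
```

For example, TARBALL_ROOT=$(find_tarball_root /opt/cloudera-manager) would store the path in a variable for use in later commands.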

Creating Users

The Cloudera Manager Server and managed services need a user account to complete tasks. When installing Cloudera Manager from tarballs, you must create this user account on all hosts manually. Because Cloudera Manager Server and managed services are configured to use the user account cloudera-scm by default, creating a user with this name is the simplest approach. Once created, this user is used automatically after installation is complete.

To create a user cloudera-scm, use a command such as the following:

$ useradd --system --home=/opt/cloudera-manager/cm-4.5/run/cloudera-scm-server --no-create-home --shell=/bin/false --comment "Cloudera SCM User" cloudera-scm

For the preceding useradd command, ensure that the --home argument path matches your environment. This path varies according to where you placed the tarball, and the version number varies among releases. For example, the --home location could be /opt/cm-4.5/run/cloudera-scm-server.
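
Since the --home path is derived from the <tarball root>, one way to avoid a mismatch is to generate the command from that path. The following is a minimal sketch that only prints the useradd command shown above and does not run it. The function name scm_useradd_command is hypothetical, and the example path is the one used earlier in this section.

```shell
# Print (not run) the useradd command for a given <tarball root>.
# Assumption: the account name defaults to cloudera-scm, as in the text.
scm_useradd_command() {
    tarball_root="$1"
    user_name="${2:-cloudera-scm}"
    printf 'useradd --system --home=%s/run/cloudera-scm-server --no-create-home --shell=/bin/false --comment "Cloudera SCM User" %s\n' \
        "$tarball_root" "$user_name"
}

# Only suggest creating the account if it does not already exist.
if id cloudera-scm >/dev/null 2>&1; then
    echo "user cloudera-scm already exists"
else
    scm_useradd_command /opt/cloudera-manager/cm-4.5
fi
```

Review the printed command before running it with root privileges on each host.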

Configuring Cloudera Manager Agents

On every Cloudera Manager Agent host machine, configure the Cloudera Manager Agent to point to the Cloudera Manager Server by setting the following properties in the <tarball root>/etc/cloudera-scm-agent/config.ini configuration file:

Property      Description
server_host   Name of the host machine where the Server is running
server_port   Port on the host machine where the Server is running
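
When there are many Agent hosts, editing config.ini by hand on each one is error-prone. The following is a minimal sketch of scripting the change, assuming each property sits on its own line in name=value form. The function name set_agent_server is hypothetical.

```shell
# Rewrite server_host and server_port in an Agent config.ini file.
# Assumption: each property sits on its own line as name=value.
set_agent_server() {
    config_file="$1"
    server_host="$2"
    server_port="$3"
    tmp_file="$(mktemp)" || return 1
    # Write to a temporary file first, then copy back so the original
    # file keeps its ownership and permissions.
    sed -e "s|^server_host=.*|server_host=$server_host|" \
        -e "s|^server_port=.*|server_port=$server_port|" \
        "$config_file" > "$tmp_file" && cat "$tmp_file" > "$config_file"
    status=$?
    rm -f "$tmp_file"
    return "$status"
}
```

The function takes the path to config.ini, the Server host name, and the Server port, in that order. Substitute the values that apply to your Cloudera Manager Server.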

Custom Cloudera Manager Users and Directories

Cloudera Manager is built to use a default set of directories and user accounts. You can use the default locations and accounts, but there is also the option to change these settings. In some cases, changing these settings is required for Cloudera Manager to work. For most installations, you can skip ahead to Step 2: Configure a Database for the Cloudera Manager Server.

Changing Directories that Cloudera Manager Uses

By default, Cloudera Manager services create directories in /var/log and /var/lib. The directories the Cloudera Manager installer attempts to create are:

  • /var/log/cloudera-scm-headlamp
  • /var/log/cloudera-scm-firehose
  • /var/log/cloudera-scm-alertpublisher
  • /var/log/cloudera-scm-eventserver
  • /var/lib/cloudera-scm-headlamp
  • /var/lib/cloudera-scm-firehose
  • /var/lib/cloudera-scm-alertpublisher
  • /var/lib/cloudera-scm-eventserver

If you are using a custom user and directory for Cloudera Manager, you will need to create these directories on the Cloudera Manager Server host and assign ownership of these directories to your user manually. Issues might arise if any of these directories already exist. The Cloudera Manager installer makes no changes to existing directories. In such a case, Cloudera Manager is unable to write to any existing directories for which it does not have proper permissions and services may not perform as expected.

There are two ways to resolve such situations: change the ownership of the existing directories, or specify alternate directories for the services. You do not need to complete both procedures.

To change ownership for existing directories:

  1. Change the directory owner to the Cloudera Manager user. If the Cloudera Manager user and group are cloudera-scm and you need to take ownership of the headlamp log directory, you would issue a command similar to the following:
    $ chown -R cloudera-scm:cloudera-scm /var/log/cloudera-scm-headlamp
  2. Repeat the process of using chown to change ownership for all existing directories to the Cloudera Manager user.
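
Repeating chown for each directory can be scripted. The following is a minimal sketch that prints, without running, a chown command for each of the eight default directories that already exists. The function names are hypothetical, and the optional prefix argument exists only so the sketch can be tried against a scratch directory.

```shell
# Print the eight default directories named in this section.
cm_default_dirs() {
    for base in /var/log /var/lib; do
        for service in headlamp firehose alertpublisher eventserver; do
            printf '%s/cloudera-scm-%s\n' "$base" "$service"
        done
    done
}

# Print the chown command for each default directory that already exists.
# The optional prefix argument is only there to allow testing in a sandbox.
chown_commands_for_existing_dirs() {
    owner="$1"
    prefix="${2:-}"
    cm_default_dirs | while IFS= read -r dir; do
        if [ -d "$prefix$dir" ]; then
            printf 'chown -R %s %s\n' "$owner" "$prefix$dir"
        fi
    done
}
```

Running chown_commands_for_existing_dirs cloudera-scm:cloudera-scm lists the commands to review and then run with sufficient privileges.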

To use alternate directories for services:

  Note: If you changed ownership of existing directories so the Cloudera Manager user can use them, you do not need to use alternate directories.

  1. If the directories you plan to use do not exist, create them now. For example, to create /var/cm_logs/cloudera-scm-headlamp for use by the cloudera-scm user, you might use the following commands:
    $ mkdir -p /var/cm_logs/cloudera-scm-headlamp
    $ chown cloudera-scm /var/cm_logs/cloudera-scm-headlamp
  2. Connect to the Cloudera Manager admin console.
  3. Under the Cloudera Management Services, click the name of the service.
  4. In the service status page, click Configuration.
  5. In the settings page, enter a term in the Search field to find the settings to be changed. For example, you might enter "/var" or "directory".
  6. Update each value with the new locations for Cloudera Manager to use.
  7. Click Save Changes.

Step 2: Configure a Database for the Cloudera Manager Server

To manage the services, Cloudera Manager Agents, and configurations in your cluster, the Cloudera Manager Server stores data in a database. You can either use an existing database or install a new database. After installing the database, you must then run a script to prepare that database for use with the Cloudera Manager Server.

  Note: The Cloudera Manager Server database is separate from the databases used by the Cloudera Manager Activity Monitor, Service Monitor, Report Manager, Host Monitor, Hive Metastore, and Cloudera Navigator. You should have installed these services' databases in the section above, Install Databases for the Cloudera Manager Services.

In this release, you can use any one of the database options listed under Database choices, earlier in this topic.

  Important: You do not need to complete all of the database options. After establishing one database for the Cloudera Manager Server, move on to the next steps. Do not install all the database options.

Example 1: Running the script when MySQL is installed on another host

This example explains how to run the script on the Cloudera Manager Server machine (myhost2) and create and use a temporary MySQL user account to connect to MySQL remotely on the MySQL machine (myhost1).

  1. On myhost1's MySQL prompt, create a temporary user who can connect from myhost2:
    mysql> grant all on *.* to 'temp'@'%' identified by 'temp' with grant option;
    Query OK, 0 rows affected (0.00 sec)
  2. On the Cloudera Manager Server host (myhost2), run the script:
    $ sudo <tarball root>/share/cmf/schema/scm_prepare_database.sh mysql -h myhost1.sf.cloudera.com -u temp -ptemp --scm-host myhost2.sf.cloudera.com scm scm scm
    Looking for MySQL binary
    Looking for schema files in /usr/share/cmf/schema
    Verifying that we can write to /etc/cloudera-scm-server
    Connecting to mysql at myhost1 as 'temp'
    Creating Cloudera Manager database 'scm'
    Setting up Cloudera Manager user 'scm'@'myhost2.sf.cloudera.com'
    Installing Cloudera Manager schema from file /usr/share/cmf/schema/cmf_schema_00001.ddl
    Installing Cloudera Manager schema from file /usr/share/cmf/schema/cmf_schema_00002.ddl
    Creating Cloudera Manager configuration file in /etc/cloudera-scm-server
    All done, your Cloudera Manager database is ready to go!
  3. On myhost1, delete the temporary user:
    mysql> drop user 'temp'@'%';
    Query OK, 0 rows affected (0.00 sec)

Example 2: Running the script to configure Oracle

This example shows how to run the script to configure an Oracle database.

[root@rhel55-6 ~]# <tarball root>/share/cmf/schema/scm_prepare_database.sh -h cm-oracle.example.com oracle orcl sample_user sample_pass
Verifying that we can write to /etc/cloudera-scm-server
Creating SCM configuration file in /etc/cloudera-scm-server
Executing: /usr/java/jdk1.6.0_31/bin/java -cp /usr/share/java/mysql-connector-java.jar:/usr/share/cmf/schema/../lib/* com.cloudera.enterprise.dbutil.DbCommandExecutor /etc/cloudera-scm-server/db.properties com.cloudera.cmf.db.
[ main] DbCommandExecutor INFO Successfully connected to database.
All done, your SCM database is configured correctly!

Preparing the Database for the Cloudera Manager Server

Cloudera Manager configuration can be completed using the scm_prepare_database.sh script, which is installed in the <tarball root>/share/cmf/schema/ directory. You must run the script on the Cloudera Manager Server host. After you have installed your database application or collected information about an existing Oracle installation, use the scm_prepare_database.sh script to prepare the database for use with the Cloudera Manager Server. This script enables Cloudera Manager Server to connect to an external database in MySQL, PostgreSQL, or Oracle. The script prepares the database by:

  • Creating the Cloudera Manager Server database configuration file.
  • Creating a database for the Cloudera Manager Server to use. This is optional and is only completed if options are specified.
  • Setting up a user account for the Cloudera Manager Server. This is optional and is only completed if options are specified.

Script syntax

scm_prepare_database.sh database-type [options] database-name username password

Required Parameter  Description
database-type       To connect to a MySQL database, specify mysql as the database type. To connect to an Oracle database, specify oracle. To connect to an external PostgreSQL database, specify postgresql.
database-name       The name of the Cloudera Manager Server database you want to create.
username            The username for the Cloudera Manager Server database you want to create.
password            The password for the Cloudera Manager Server database you want to create. If you don't specify the password on the command line, the script will prompt you to enter it.

Option            Description
-h or --host      The IP address or hostname of the host where MySQL or Oracle is installed. The default is to use the local host.
-P or --port      The port number to use to connect to MySQL or Oracle. The default port is 3306. This option is used for a remote connection only.
-u or --user      The username for the MySQL or Oracle application. The default is root.
-p or --password  The password for the MySQL or Oracle application. The default is no password.
--scm-host        The hostname where the Cloudera Manager Server is installed. Omit if the Cloudera Manager server and MySQL or Oracle are installed on the same host.
--config-path     The path to the Cloudera Manager Server configuration files. The default is /etc/cloudera-scm-server.
--schema-path     The path to the Cloudera Manager schema files. The default is /usr/share/cmf/schema (the location of the script).
-f                The script will not stop if an error is encountered.
-? or --help      Display help.

  Note: You can also run scm_prepare_database.sh without options to see the syntax.
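
To make the argument order in the script syntax concrete, the following is a minimal sketch that assembles a command line from its parts and prints it without running it. It uses only the -h option from the table above and omits the password so that the script prompts for it. The function name prepare_db_command is hypothetical.

```shell
# Print (not run) an scm_prepare_database.sh command line.
# Only the -h option from the options table is used here.
prepare_db_command() {
    tarball_root="$1"
    db_type="$2"
    db_host="$3"
    db_name="$4"
    db_user="$5"
    # Accept only the database types named in the parameter table.
    case "$db_type" in
        mysql|oracle|postgresql) ;;
        *) echo "unsupported database type: $db_type" >&2; return 1 ;;
    esac
    # The password is left off so that the script prompts for it.
    printf '%s/share/cmf/schema/scm_prepare_database.sh %s -h %s %s %s\n' \
        "$tarball_root" "$db_type" "$db_host" "$db_name" "$db_user"
}
```

Leaving the password off the command line keeps it out of the shell history, at the cost of an interactive prompt.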

Step 3: Start the Cloudera Manager Server

  Important: When you start the Cloudera Manager Server and Agents, Cloudera Manager assumes you are not already running HDFS and MapReduce. If you are, shut down HDFS and MapReduce (service hadoop-0.20-<daemon> stop), and configure the init scripts to not start on boot (for example, chkconfig hadoop-0.20-<daemon> off). Contact Cloudera Support for help converting your existing Hadoop configurations for use with Cloudera Manager.
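
If several Hadoop daemons are installed on a host, the stop and chkconfig commands have to be repeated for each one. The following is a minimal sketch that discovers the daemons from their init scripts and prints the commands without running them. It assumes the init scripts live in /etc/init.d and are named hadoop-0.20-<daemon>. The function name hadoop_shutdown_commands is hypothetical.

```shell
# Print the stop and chkconfig commands for every hadoop-0.20-* init script.
# Assumption: init scripts are in /etc/init.d unless a directory is passed.
hadoop_shutdown_commands() {
    init_dir="${1:-/etc/init.d}"
    for script in "$init_dir"/hadoop-0.20-*; do
        # Skip the literal pattern when no init script matches.
        [ -e "$script" ] || continue
        name="${script##*/}"
        printf 'service %s stop\n' "$name"
        printf 'chkconfig %s off\n' "$name"
    done
}
```

No output means no matching init scripts were found on the host.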

The way in which you start the Cloudera Manager Server varies according to what account you want the server to run under.

To run as the user invoking the script:

$ <tarball root>/etc/init.d/cloudera-scm-server start

To explicitly run the server as root:

$ sudo <tarball root>/etc/init.d/cloudera-scm-server start

To run as another user:

If you run as another user, ensure the user you created for Cloudera Manager owns the location to which you extracted the tarball including the newly created database files. If you followed the earlier examples and created the directory /opt/cloudera-manager and the user cloudera-scm, you could use the following command to change ownership of the directory:

$ sudo chown -R cloudera-scm:cloudera-scm /opt/cloudera-manager
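
After changing ownership, it can be useful to confirm that nothing under the directory was missed. The following is a minimal sketch using find, which lists any path not owned by the given user. The function name not_owned_by is hypothetical.

```shell
# List paths under a directory that are NOT owned by the given user.
# No output means the whole tree has the expected owner.
not_owned_by() {
    owner="$1"
    dir="$2"
    find "$dir" ! -user "$owner"
}
```

For example, not_owned_by cloudera-scm /opt/cloudera-manager should print nothing once the chown command above has completed.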

Once you have established proper ownership of directory locations, you can start Cloudera Manager Server using the user account you chose. For example, you might run the Cloudera Manager Server as cloudera-service.

In such a case there are two options:

  • Use the following command:
    $ sudo -u <user> <tarball root>/etc/init.d/cloudera-scm-server start
  • Edit the configuration files so the script internally changes the user. Then run the script as root. To make this possible, complete the following steps:
    1. Remove the following line from <tarball root>/etc/default/cloudera-scm-server:
       export CMF_SUDO_CMD=" "
    2. Change the user and group in <tarball root>/etc/init.d/cloudera-scm-server to the user you want the server to run as. For example, to run as cloudera-service, change the user and group as follows:
       USER=cloudera-service
       GROUP=cloudera-service
    3. Run the server script as root:
       $ sudo <tarball root>/etc/init.d/cloudera-scm-server start

To start the Cloudera Manager Server automatically after a reboot:

Run the following commands on the Cloudera Manager Server host.

$ cp <tarball root>/etc/init.d/cloudera-scm-server /etc/init.d/cloudera-scm-server
$ chkconfig cloudera-scm-server on

Then, on the Cloudera Manager Server host, open the /etc/init.d/cloudera-scm-server file and change the value of CMF_DEFAULTS from ${CMF_DEFAULTS:-/etc/default} to <tarball root>/etc/default.

Step 4: Start the Cloudera Manager Agents

To start the Cloudera Manager Agents:

Run this command on each Agent machine:

$ sudo <tarball root>/etc/init.d/cloudera-scm-agent start

When the Agent starts, it contacts the Cloudera Manager Server.

To start the Cloudera Manager Agents automatically after a reboot:

Run the following commands on each Agent machine.

$ cp <tarball root>/etc/init.d/cloudera-scm-agent /etc/init.d/cloudera-scm-agent
$ chkconfig cloudera-scm-agent on

Then, on each Agent, open the /etc/init.d/cloudera-scm-agent file and change the value of CMF_DEFAULTS from ${CMF_DEFAULTS:-/etc/default} to <tarball root>/etc/default.
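
The CMF_DEFAULTS edit described for the Server and Agent init scripts can also be scripted. The following is a minimal sketch, assuming the copied init script contains the literal text ${CMF_DEFAULTS:-/etc/default} and that the <tarball root> path contains no | or & characters. The function name set_cmf_defaults is hypothetical.

```shell
# Replace the default CMF_DEFAULTS expression in a copied init script.
# Assumption: the script contains the literal text ${CMF_DEFAULTS:-/etc/default},
# and the tarball root path contains no | or & characters.
set_cmf_defaults() {
    init_script="$1"
    tarball_root="$2"
    tmp_file="$(mktemp)" || return 1
    # The escaped dollar sign makes sed match the text literally.
    sed "s|\\\${CMF_DEFAULTS:-/etc/default}|$tarball_root/etc/default|" \
        "$init_script" > "$tmp_file" && cat "$tmp_file" > "$init_script"
    status=$?
    rm -f "$tmp_file"
    return "$status"
}
```

The function takes the path of the copied init script and the <tarball root> path. Check the edited file afterwards to confirm that the substitution was made.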

Troubleshooting Cloudera Manager Agent Connection Problems

If there is a communication failure between a Cloudera Manager Agent and Cloudera Manager Server, you can use the Cloudera Manager Server log file /var/log/cloudera-scm-server/cloudera-scm-server.log and the Cloudera Manager Agent log files /var/log/cloudera-scm-agent/cloudera-scm-agent.log to troubleshoot the problem. The following is a common error.

Error message                                                Description
error: (113, 'No route to host') in cloudera-scm-agent.log.  This indicates that the agent is unable to connect to the Cloudera Manager Server. Make sure that iptables and SELinux are both turned off.
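
A quick way to see whether an Agent has hit this error is to search its log. The following is a minimal sketch that counts matching lines, assuming the Agent log is at the path given above unless another path is passed. The function name count_no_route_errors is hypothetical.

```shell
# Count "No route to host" lines in a Cloudera Manager Agent log.
# Assumption: the default log path is the one named in this section.
count_no_route_errors() {
    log_file="${1:-/var/log/cloudera-scm-agent/cloudera-scm-agent.log}"
    if [ ! -r "$log_file" ]; then
        echo "cannot read $log_file" >&2
        return 2
    fi
    # grep -c exits non-zero when the count is 0, so mask that status.
    grep -c "No route to host" "$log_file" || true
}
```

A non-zero count suggests checking the network path and firewall settings between the Agent host and the Server host.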

Step 5: Start the Cloudera Manager Admin Console

The Cloudera Manager Admin Console enables you to use Cloudera Manager to configure, manage, and monitor Hadoop on your cluster. Before using the Cloudera Manager Admin Console, gather information about the server's URL and port.

The server URL takes the following form:

http://<Server host>:<port>

<Server host> is the fully-qualified domain name or IP address of the host machine where the Cloudera Manager Server is installed. <port> is the port configured for the Cloudera Manager Server. The default port is 7180. For example, use a URL such as the following:

http://myhost.example.com:7180/

Cloudera Manager does not support changing the admin username for the installed account. You can change the password using Cloudera Manager after you run the wizard in the next section. While you cannot change the admin username, you can add a new user, assign administrative privileges to the new user, and then delete the default admin account.

To start the Cloudera Manager Admin Console:

  1. In a web browser, enter the URL, including the port, for the Cloudera Manager Server. The login screen for Cloudera Manager appears.
  2. Log in to Cloudera Manager. The default credentials are:

    Username: admin

    Password: admin

Install the License File

  1. When you start the Cloudera Manager Admin Console, the install wizard starts up. Click Continue to get started.
  2. Browse to your Cloudera Manager License file. If you don't install it now, Cloudera Manager Standard Edition will be installed. You can also elect to install the 60-day Trial Edition that gives you access to the full set of Enterprise features for 60 days.
  3. If you install the Cloudera Manager license, you must restart the Cloudera Manager server. From the command line, enter:
    $ sudo <tarball root>/etc/init.d/cloudera-scm-server restart
  4. After the Cloudera Manager server restarts, log in again.
      Note: After restarting the server, wait a few seconds for the server to finish initializing before you try to reconnect to the Admin Console.

  5. Click Continue in the next screen.
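
Rather than guessing how long to wait after the restart in the steps above, a script can poll until the Admin Console answers. The following is a minimal sketch of a generic retry helper. The function name wait_until is hypothetical, and the curl call in the comment uses the example host name and default port from this topic.

```shell
# Retry a command until it succeeds or the attempts run out.
# Usage: wait_until <attempts> <delay seconds> <command> [args...]
wait_until() {
    attempts="$1"
    delay="$2"
    shift 2
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Example (host name is a placeholder; requires curl):
# wait_until 30 2 curl -sf -o /dev/null http://myhost.example.com:7180/
```

The helper returns 0 as soon as the command succeeds and 1 if every attempt fails.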

Step 6: Install CDH Using Parcels

Parcels are a package-less way to install CDH on your cluster. This section walks through installing CDH using parcels, and lets Cloudera Manager download the parcels automatically. See the section on Using Parcels for alternatives.

Once logged into the Cloudera Manager Admin Console, you will be prompted for a set of hosts for your CDH cluster installation.

Choose the Currently Managed Hosts tab to select the hosts already running the Cloudera Manager Agents started in Step 4. Select all the hosts, and click Continue.

Choose to install using parcels. Create the parcel repository directory by using:

$ mkdir -p /opt/cloudera/parcel-repo

The server settings allow you to use a different directory.

You must also change the ownership of the directory to the user that runs Cloudera Manager:

$ chown <username> /opt/cloudera/parcel-repo
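
The two parcel repository commands can be combined into one step. The following is a minimal sketch, assuming /opt/cloudera/parcel-repo is used unless another path is passed. The function name prepare_parcel_repo is hypothetical.

```shell
# Create a parcel repository directory and hand it to the given user.
# Assumption: /opt/cloudera/parcel-repo is used unless a path is passed.
prepare_parcel_repo() {
    owner="$1"
    repo_dir="${2:-/opt/cloudera/parcel-repo}"
    # -p also creates missing parent directories such as /opt/cloudera.
    mkdir -p "$repo_dir" && chown "$owner" "$repo_dir"
}
```

Run it with sufficient privileges, passing the user name that runs Cloudera Manager as the first argument.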

Choose a version of CDH (and, optionally, Impala and/or Solr) to install on your cluster.

Once you continue, CDH will be downloaded and distributed to all the hosts in your cluster.

Step 7: Configure Services

The following instructions describe how to use the Cloudera Manager wizard to configure and start the Hadoop services.

  Note: After configuring your services, the installation wizard attempts to automatically start the Cloudera Management Services under the assumption that services will run using cloudera-scm. If you configured these services to run using a user other than cloudera-scm, then the Cloudera Management Services do not start automatically. In such a case, change the service configuration to use the user account that you selected. After making this configuration change, manually start the roles, and then begin the process of configuring services.

Changing the Cloudera Manager User

If you specify a user name other than cloudera-scm as the Cloudera Manager user, you must update the places where the default Cloudera Manager user name is specified. This means changing the System User and System Group properties for the Cloudera Management Services that use cloudera-scm.

To update the Cloudera Manager user name:

  1. Connect to the Cloudera Manager admin console.
  2. Go to the Services tab drop-down menu and select Cloudera Management Services. The Status page for the service appears.
  3. From the Configuration tab select View and Edit and use the search box to find the property to be changed. For example, you might enter "system" to find the System User and System Group properties.
  4. Make any changes required to the System User and System Group to ensure Cloudera Manager uses the proper user accounts.
  5. Click Save Changes.

To configure services:

  1. Once you have CDH set up, you can choose the Hadoop services you want to start. Choose one of the standard combinations: Core Hadoop, HBase Services, or All Services; these combinations take into account the dependencies between the Hadoop services. Alternatively, you can choose Custom Services, and select the services individually.
      Note: Some services depend on others; for example, HBase requires HDFS and ZooKeeper.

    The Cloudera Management Services, which are added to each package, are Cloudera Manager processes that run to support monitoring and management features in Cloudera Manager.

  2. On the Database Setup page, enter the information requested. If the installation you are upgrading includes existing roles, those roles will not require configuration information. At most, you will need to provide information for the Activity Monitor, Service Monitor, Report Manager, and Host Monitor databases.
      Important: The value you enter as the database hostname must match the value you entered for the hostname (if any) when you created the database (see Installing and Configuring Databases).

    For example, if you entered the following for the Activity Monitor database

    mysql> grant all on activity_monitor.* TO 'amon_user'@'localhost' IDENTIFIED BY 'amon_password';

    the value you enter here for the database hostname must be localhost. On the other hand, if you had entered the following when you created the database

    mysql> grant all on activity_monitor.* TO 'amon_user'@'myhost1.myco.com' IDENTIFIED BY 'amon_password';

    the value you enter here for the database hostname must be myhost1.myco.com. If you did not specify a host, or used a wildcard to allow access from any host, you can enter either the fully-qualified domain name (FQDN) here, or localhost. For example, if you entered

    mysql> grant all on activity_monitor.* TO 'amon_user'@'%' IDENTIFIED BY 'amon_password';

    the value you enter here for the database hostname can be either the FQDN or localhost. Similarly, if you entered

    mysql> grant all on activity_monitor.* TO 'amon_user' IDENTIFIED BY 'amon_password';

    the value you enter here for the database hostname can be either the FQDN or localhost.

  3. Click Test Connection to confirm that Cloudera Manager can communicate with the databases using the information you have supplied. This atypical transaction takes two heartbeats to complete (about 30 seconds with the default heartbeat interval). If the test succeeds in all cases, click Continue; otherwise check and correct the information you have provided for the databases and then try the test again.
  4. Confirm the settings entered for file system paths, such as the NameNode Data Directory and the DataNode Data Directory.
  5. Supply the name of the mail server (it can be localhost), the mail server user, and the mail recipients.
  6. Click Continue. The wizard starts the services on your cluster.
  7. When all of the services are started, click Continue.
  8. Click Continue.
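
The hostname-matching rule from the Database Setup step can be summarized in a few lines of code. The following is a minimal sketch that maps the host part of a MySQL grant statement to the value to enter in the wizard. The function name db_hostname_hint is hypothetical.

```shell
# Suggest what to enter as the database hostname in the wizard,
# given the host part used in the MySQL grant statement.
db_hostname_hint() {
    grant_host="${1:-}"
    case "$grant_host" in
        # No host, or the % wildcard: either form is accepted.
        ''|'%') echo "FQDN or localhost" ;;
        # A specific host: the wizard value must match it exactly.
        *)      echo "$grant_host" ;;
    esac
}
```

For example, db_hostname_hint myhost1.myco.com prints myhost1.myco.com, which matches the second grant statement shown in that step.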

Step 8: Change the Default Administrator Password

As soon as possible after running the wizard and beginning to use Cloudera Manager, you should change the default administrator password.

To change the administrator password:

  1. From the Administration tab, select Users.
  2. Click the Change Password button next to the admin account.
  3. Enter a new password twice and then click Update.

Step 9: Test the Installation

Now that you have finished the CDH4 and Cloudera Manager installation, you are ready to test the installation. For testing instructions, see Testing the Installation.

  Note: If you change the hostname or port where the Cloudera Manager Server is running, or you enable TLS security, you must restart the Cloudera Management Services to update the URL to the Server. For instructions, see Restarting a Service.