Introduction to Cloudera Manager Installation
Cloudera Manager automates the installation and configuration of CDH on an entire cluster, requiring only that you have root SSH access to your cluster's machines, and access to the internet or a local repository with installation files for all these machines. Cloudera Manager consists of:
- A small self-executing Cloudera Manager installation program to install the Cloudera Manager Server and other packages in preparation for cluster host installation
- The Cloudera Manager wizard for automating CDH installation and configuration on the cluster hosts
- Cloudera Manager features for monitoring and configuring the cluster after installation is completed
Cloudera Manager provides two methods for installing CDH and its associated components: traditional packages (RPMs or Debian packages) or parcels. Parcels are a new packaging format that simplifies the installation process, and more importantly allows you to download, distribute, and activate a new CDH version all from within Cloudera Manager. Parcels are available for CDH4 (4.1.3 and onwards), Cloudera Impala (1.0 and onwards) and Cloudera Search.
About the Cloudera Manager Installation Program
The Cloudera Manager installation program, which you will install on the host where you want the Cloudera Manager Server to run, automatically:
- Installs the package repositories for Cloudera Manager and the Oracle Java Development Kit (JDK)
- Installs the Oracle JDK 1.6 if it is not already installed. Cloudera Manager also supports JDK 1.7 if it is already installed on the cluster hosts.
- Installs the Cloudera Manager Server
- Installs and configures an embedded PostgreSQL database for use by the Cloudera Manager server
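The JDK rule above (install Oracle JDK 1.6 only when no JDK is present, and accept an existing 1.6 or 1.7) can be sketched as a small decision function. This is an illustrative sketch, not the installation program's actual logic: the function names and the returned labels are hypothetical, and the handling of any other JDK version is an assumption, because this section does not describe it.

```python
import re


def parse_java_version(version_output):
    """Return (major, minor) parsed from `java -version` style text, or None."""
    match = re.search(r'version "(\d+)\.(\d+)', version_output or "")
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def jdk_action(version_output):
    """Decide what to do about the JDK on a host (hypothetical labels)."""
    version = parse_java_version(version_output)
    if version is None:
        # No JDK detected: the installer would install Oracle JDK 1.6.
        return "install-jdk-1.6"
    if version in ((1, 6), (1, 7)):
        # An existing JDK 1.6 or 1.7 is supported and left in place.
        return "use-existing"
    # Assumption: other versions are not covered by the text above.
    return "needs-review"


print(jdk_action('java version "1.7.0_25"'))
print(jdk_action(""))
```

The text passed in is assumed to be whatever `java -version` printed on the host; an empty string stands for a host with no JDK.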
About the Cloudera Manager Wizard
After you have installed the Cloudera Manager Server and run it for the first time, you can use the Cloudera Manager wizard to do the following on the cluster hosts automatically:
- Using SSH, discover the cluster hosts you specify via IP address ranges or hostnames
- Configure the parcel or package repositories for Cloudera Manager, CDH, Impala, and the Oracle JDK
- Install the Cloudera Manager Agent, CDH, and Impala on the cluster hosts
- Install the Oracle JDK (1.6) if it is not already installed on the cluster hosts. Cloudera Manager also supports JDK 1.7 if it is already installed on the cluster hosts.
- Determine mapping of services to hosts
- Suggest a Hadoop configuration and start the Hadoop services
Before you start the wizard, note the following:
- If you will use external databases, you must install and configure those databases before you start the wizard. These are the databases that will be used by Cloudera Manager, Service Monitor, Activity Monitor, Host Monitor, Report Manager, Cloudera Navigator, and the Hive Metastore. See Installing and Configuring Databases for more information. If you will use the embedded PostgreSQL database, you do not have to prepare databases in advance.
- When you install or upgrade Cloudera Manager and/or CDH using parcels, only the Cloudera Manager server host needs access to a remote parcel repository. The other cluster hosts only need access to the local repository (by default /opt/cloudera/parcels) on the Cloudera Manager server.
- When you use the Cloudera Manager Wizard to install or upgrade Cloudera Manager and/or CDH on your cluster hosts using packages, all of those hosts need access to installation files.
- Installation files are available on the Internet at archive.cloudera.com or you can download installation files and create a local repository. For more information, see the individual installation and upgrade procedures, and the Cloudera Manager Frequently Asked Questions.
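When the wizard discovers hosts, you give it hostnames or IP address ranges. The sketch below shows one way a range could be expanded into individual hosts before you enter them. The bracketed "[start-end]" notation and the function name expand_host_pattern are assumptions made for illustration; check the wizard's own help text for the syntax it actually accepts.

```python
import itertools
import re

# Matches one bracketed numeric range such as "[1-4]" or "[07-10]".
_RANGE = re.compile(r"\[(\d+)-(\d+)\]")


def expand_host_pattern(pattern):
    """Expand every "[start-end]" range in pattern into concrete host strings."""
    # With two capture groups, split() yields: literal, start, end, literal, ...
    pieces = _RANGE.split(pattern)
    literals = pieces[0::3]
    bounds = zip(pieces[1::3], pieces[2::3])

    choices = []
    for start, end in bounds:
        if int(start) > int(end):
            raise ValueError("range start exceeds end in %r" % pattern)
        # Keep zero padding when the start is written with a leading zero.
        width = len(start) if start.startswith("0") else 0
        choices.append([str(n).zfill(width) for n in range(int(start), int(end) + 1)])

    hosts = []
    for combo in itertools.product(*choices):
        host = literals[0]
        for value, literal in zip(combo, literals[1:]):
            host += value + literal
        hosts.append(host)
    return hosts


print(expand_host_pattern("10.1.1.[1-4]"))
print(expand_host_pattern("host[07-10].example.com"))
```

A pattern without brackets is returned unchanged as a single-element list, so plain hostnames and ranges can be mixed in one input list.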
You can abort the Cloudera Manager Agent and CDH installation process at any point, and the Cloudera Manager wizard will automatically roll back the installation of any components that have not finished installing. (Installation that has completed successfully on a given host is not rolled back on that host.)
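The abort behavior can be pictured with a small model. This is only a sketch of the rule as stated above, not the wizard's implementation: the function name plan_abort and the True/False status map are hypothetical.

```python
def plan_abort(host_status):
    """Split hosts into those left alone and those rolled back after an abort.

    host_status maps a hostname to True when installation on that host
    completed successfully, and to False otherwise.
    """
    # Hosts that finished successfully keep their installation.
    kept = sorted(host for host, done in host_status.items() if done)
    # Hosts that had not finished are reverted.
    rolled_back = sorted(host for host, done in host_status.items() if not done)
    return kept, rolled_back


kept, rolled_back = plan_abort({"node1": True, "node2": False, "node3": False})
print("left installed:", kept)
print("rolled back:", rolled_back)
```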
Installing Cloudera Manager for the First Time
To install Cloudera Manager, you will:
- Install a database application on the Cloudera Manager Server host machine or on a host machine that the Cloudera Manager Server can access, and (depending on the configuration you decide on) on other hosts as well.
- Install the Cloudera Manager Server on one cluster host machine.
- Install CDH and the Cloudera Manager Agents on the other cluster host machines.
The following diagram illustrates a simple example of where each component is installed.
If you want to configure security for Cloudera Manager, see the following sections:
For overview and usage information, see Managing Clusters with Cloudera Manager.