Proof of Concept Installation Using Embedded Database on CentOS

These steps install a demonstration cluster. This configuration is not for production use: it relies on the embedded PostgreSQL database, which does not scale to meet the needs of a production cluster. The proof of concept installation lets you try out CDH and its services to familiarize yourself with the system.

Step 1: Prepare Hosts

Even a proof-of-concept deployment has significant hardware requirements. For this example, Cloudera uses the following configuration in Google Compute Engine. Use a configuration with similar memory and processing capacity.
  • 3 Secondary instances: n1-standard-2 (2 CPUs, 7.5 GB RAM).
  • 1 Primary (master) instance: n1-standard-4 (4 CPUs, 15 GB RAM).
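
If you are using hosts other than the Google Compute Engine instances listed above, a quick check of each host's CPU count and memory can confirm that it is in the same range. The following is a minimal sketch for Linux hosts, not part of the Cloudera tooling: meets_minimum is a hypothetical helper, and the thresholds (2 CPUs, 7000 MB) are assumptions modeled on the smaller instance size.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when a host has at least the wanted CPUs and memory.
# Arguments: actual_cpus actual_mem_mb min_cpus min_mem_mb
meets_minimum() {
    [ "$1" -ge "$3" ] && [ "$2" -ge "$4" ]
}

# Read this host's CPU count and total memory (Linux only; falls back to 0).
cpus=$(nproc 2>/dev/null || echo 0)
mem_mb=$(awk '/^MemTotal:/ { print int($2 / 1024) }' /proc/meminfo 2>/dev/null)
mem_mb=${mem_mb:-0}

# Assumed thresholds: 2 CPUs and roughly 7 GB of memory.
if meets_minimum "$cpus" "$mem_mb" 2 7000; then
    echo "OK: ${cpus} CPUs, ${mem_mb} MB RAM"
else
    echo "Below the example sizing: ${cpus} CPUs, ${mem_mb} MB RAM"
fi
```

Run the script on each host before continuing, and raise the thresholds for the master host to match the larger instance size.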

Step 2: Prepare Databases

For this proof of concept example, use the embedded PostgreSQL database. The production examples use external databases. That is the primary difference between proof of concept and production deployments: production deployments must use an external database.

Step 3: Install Cloudera Manager

Download and install Cloudera Manager on your Primary (master) instance.

  1. Connect to your primary host as the root user with ssh:
    ssh root@your.server.com
  2. Download the Cloudera Manager installer.
    wget http://archive.cloudera.com/cm5/installer/latest/cloudera-manager-installer.bin
  3. Change permissions to make the binary file executable.
    chmod u+x cloudera-manager-installer.bin
  4. Run the installer.
    sudo ./cloudera-manager-installer.bin
  5. Follow the prompts in the installer wizard to install Cloudera Manager.
    1. Click Next to detect the Cloudera Manager host and install the package repository for Cloudera and the Java Runtime Environment.
    2. Review and accept the Cloudera Express license.
    3. Review and accept the Oracle JRE license.
    4. When the installation is complete, copy the HTTP address of your Cloudera Manager Server (http://your.server.com:7180).
    5. Click OK twice to complete the installation.
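
The Cloudera Manager Server can take some time to start responding after the installer exits. As a rough sketch, you could poll the address from the installer until it answers before opening it in a browser. Both cm_url and wait_for_cm are hypothetical helpers written for this example, and the attempt count and delay are arbitrary assumptions.

```shell
#!/bin/sh
# Build the Admin Console URL for a host. Port 7180 is the one shown in the installer output.
cm_url() {
    printf 'http://%s:%s\n' "$1" "${2:-7180}"
}

# Hypothetical helper: poll a URL until it answers or the attempts run out.
# Arguments: url [attempts] [seconds_between_attempts]
# The body runs in a subshell so its variables stay local.
wait_for_cm() (
    url=$1
    attempts=${2:-30}
    delay=${3:-10}
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # curl exits 0 as soon as the server returns any HTTP response.
        if curl --silent --output /dev/null --max-time 5 "$url"; then
            exit 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    exit 1
)
```

For example, wait_for_cm "$(cm_url your.server.com)" returns once the server responds, or fails after about five minutes with the default settings.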

Step 4: Start the Cloudera Manager Server

  1. In a web browser, go to http://your.server.com:7180.
  2. Accept the End-User License Terms and Conditions.
  3. Choose the Cloudera Express (free) license.
  4. Click Continue.
  5. Review the products that will be installed. Click Continue.
  6. Search for the cluster host names using a pattern. For example, the following pattern:
    poc-install-[1-4].vpc.acmecornproducts.com
    locates the cluster hosts:
    poc-install-1.vpc.acmecornproducts.com
    poc-install-2.vpc.acmecornproducts.com
    poc-install-3.vpc.acmecornproducts.com
    poc-install-4.vpc.acmecornproducts.com
  7. Click Continue.
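
To preview which host names a numeric range such as the one in step 6 covers, you can expand it yourself before typing the pattern into the wizard. The sketch below uses expand_hosts, a hypothetical helper that only imitates the [1-4] range; it is not the wizard's own pattern matcher.

```shell
#!/bin/sh
# Hypothetical helper: print one host name per number in a range.
# Arguments: prefix first last suffix
# The body runs in a subshell so the loop counter stays local.
expand_hosts() (
    i=$2
    while [ "$i" -le "$3" ]; do
        printf '%s%s%s\n' "$1" "$i" "$4"
        i=$((i + 1))
    done
)

# Prints the four host names from the example pattern.
expand_hosts poc-install- 1 4 .vpc.acmecornproducts.com
```

On a Linux host, the output can be fed to a lookup tool such as getent hosts to confirm that each name resolves before you start the search.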

Step 5: Select and Start Services

  1. Accept all defaults on Cluster Installation page 1. Click Continue.
  2. Select the Install Oracle Java SE Development Kit (JDK) check box. Click Continue.
  3. Do not enable Single User Mode. Click Continue.
  4. On the SSH Login Credentials page, enter the password and confirm it (for example, cloudera). Click Continue. Wait for the cluster installation to complete. Click Continue.
  5. Wait for installation of your selected parcels to complete. Click Continue.
  6. Wait for the validation of your cluster to complete. Click Finish.

Step 6: Set Up Your Cluster

  1. On Cluster Setup page 1, choose Core Hadoop. Click Continue.
  2. On page 2, accept the default role assignments. Click Continue.
  3. On the Database Setup page, choose Use Embedded Database. Copy the passwords shown for Hive, Hue, and the Oozie Server and save them for future reference.
  4. Click Test Connection.
  5. After the connections are validated, click Continue.
  6. Review your changes and click Continue.
  7. Wait for the first run command to complete. Click Continue.
  8. Click Finish, and enjoy your new CDH cluster.