Install Hue on EC2 in AWS

This page explains one way to install CDH and Hue on an EC2 cluster in AWS.

Launch EC2 instances in AWS

This is one way to create a cluster of EC2 instances for installing CDH. Ideally, use four instances, each with at least 2 cores and 8 GB of RAM.
  1. Log on to Amazon Web Services and go to the EC2 Dashboard.
  2. Click Launch Instance.
  3. Select a Linux distribution (here we use RedHat 7.3).
  4. Select m3.large (at a minimum).
  5. Click Next: Configure Instance Details.
  6. Increase the Number of Instances to 4 (at a minimum).
  7. Click Next: Add Storage and increase size to 100 GB.
  8. Click Next: Add Tags and name your instances.
  9. Click Next: Configure Security Group and click Add Rule:
    • Set Type to Custom TCP Rule and Port Range to 7180.
    • Set Source to Custom and enter 0.0.0.0/0.
  10. Repeat for the other ports in Ports Used by Cloudera Manager and Cloudera Navigator until the security group includes:
    • 7180: Cloudera Manager http web console
    • 7183: Cloudera Manager https web console
    • 7182: Cloudera Manager listens to agent heartbeats
    • 7432: Embedded PostgreSQL database
    • 9000: Cloudera Manager server and agent communication
    • 9001: Cloudera Manager server and agent communication
  11. Click Review and Launch.
  12. Select Create a new key pair, name it, and click Download Key Pair (or use an existing one).
  13. Click Launch Instances, and when ready, View Instances.
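
If the security group already exists, the rules from steps 9 and 10 can also be added from the command line. The following is a minimal sketch, assuming the AWS CLI is installed and configured; the group ID `sg-0123456789abcdef0` is a hypothetical placeholder. The script only prints the commands so that they can be reviewed first.

```shell
#!/bin/bash
# Hypothetical security group ID; replace it with your own.
SG_ID="${SG_ID:-sg-0123456789abcdef0}"
# Source range from step 9; substitute a narrower range if you prefer.
CIDR="${CIDR:-0.0.0.0/0}"
# Ports listed in step 10.
PORTS=(7180 7183 7182 7432 9000 9001)

# Print one AWS CLI command per port without running it.
print_ingress_rules() {
  local port
  for port in "${PORTS[@]}"; do
    echo "aws ec2 authorize-security-group-ingress --group-id $SG_ID --protocol tcp --port $port --cidr $CIDR"
  done
}

print_ingress_rules
```

To apply the rules after reviewing them, pipe the output to `bash`.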

Configure Instances and Install Cloudera Manager

These steps are for RedHat 7.3 (user_name = ec2-user). No matter the distribution, always:
  • Disable SELinux.
  • Disable transparent huge page compaction.
  • Set swappiness to 10.

Instructions are below. Also see Connecting to Your Linux Instance Using SSH.
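
Before changing anything, it can help to see which of the three settings an instance already satisfies. The following is a rough sketch, not an official check: it reads the current values and skips any source that is not available on the host. The helper names `thp_selected` and `report_settings` are made up for this example.

```shell
#!/bin/bash
# Extract the active value from a transparent huge page setting,
# which the kernel prints in brackets, e.g. "always madvise [never]".
thp_selected() {
  sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# Print the current values; a check is skipped if its source is missing.
report_settings() {
  local thp=/sys/kernel/mm/transparent_hugepage
  local f
  if command -v getenforce >/dev/null 2>&1; then
    echo "selinux: $(getenforce)"
  fi
  for f in defrag enabled; do
    if [ -r "$thp/$f" ]; then
      echo "thp_$f: $(thp_selected < "$thp/$f")"
    fi
  done
  if [ -r /proc/sys/vm/swappiness ]; then
    echo "swappiness: $(cat /proc/sys/vm/swappiness)"
  fi
}

report_settings
```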

Run on all instances in the cluster

Update the settings below on every instance, then reboot each instance so that the changes take effect.
  1. Log on to each EC2 instance from a terminal:
    chmod 400 <private_key>.pem
    ssh -i <private_key>.pem <user_name>@<public_dns_name>
    sudo su -
  2. Update yum and install wget:
    yum -y update
    yum -y install wget
  3. Set swappiness to 10 by editing /etc/sysctl.conf:
    1. Apply to the running system:
      sysctl -w vm.swappiness=10
    2. Append property to /etc/sysctl.conf:
      vi /etc/sysctl.conf
      vm.swappiness=10

      To check the current value: sysctl -n vm.swappiness

  4. Disable transparent huge page compaction by editing /etc/rc.local:
    1. Run on each instance:
      echo never > /sys/kernel/mm/transparent_hugepage/defrag
      echo never > /sys/kernel/mm/transparent_hugepage/enabled
    2. Append commands to /etc/rc.local and change permissions:
      vi /etc/rc.local
      echo never > /sys/kernel/mm/transparent_hugepage/defrag
      echo never > /sys/kernel/mm/transparent_hugepage/enabled
      chmod 755 /etc/rc.d/rc.local
      source /etc/rc.local
  5. Disable SELinux by editing /etc/selinux/config and rebooting the instance:
    vi /etc/selinux/config
    SELINUX=disabled
    reboot

    To check the status: sestatus
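
The file edits in steps 3 to 5 can also be scripted so that running them twice does not duplicate lines. This is a minimal sketch under the assumption that the files use plain `key=value` lines; the helper names `set_conf`, `append_once`, and `apply_settings` are invented for this example. Nothing is changed unless the script is called with `--apply` as root.

```shell
#!/bin/bash
# Set key=value in a config file: rewrite an existing line for the key,
# or append one if the key is absent. Safe to run more than once.
set_conf() {
  local file=$1 key=$2 value=$3 tmp
  if grep -q "^${key}[[:space:]]*=" "$file" 2>/dev/null; then
    tmp=$(mktemp)
    sed "s|^${key}[[:space:]]*=.*|${key}=${value}|" "$file" > "$tmp" &&
      cat "$tmp" > "$file"   # overwrite in place to keep permissions
    rm -f "$tmp"
  else
    echo "${key}=${value}" >> "$file"
  fi
}

# Append a line only if that exact line is not already in the file.
append_once() {
  local file=$1 line=$2
  grep -qxF -- "$line" "$file" 2>/dev/null || echo "$line" >> "$file"
}

# Persist the settings from steps 3 to 5 (requires root).
apply_settings() {
  local f
  set_conf /etc/sysctl.conf vm.swappiness 10
  set_conf /etc/selinux/config SELINUX disabled
  for f in defrag enabled; do
    append_once /etc/rc.d/rc.local \
      "echo never > /sys/kernel/mm/transparent_hugepage/$f"
  done
  chmod 755 /etc/rc.d/rc.local
}

if [ "${1:-}" = "--apply" ]; then
  apply_settings
fi
```

A reboot is still needed afterwards, as in step 5.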

Run on one instance only

Install Cloudera Manager and its dependencies. Create a small script or run the commands individually:
vi install_cm.sh
#!/bin/bash

## Download the Cloudera Manager repository for the latest release (on your OS/ver):
wget https://archive.cloudera.com/cm5/redhat/7/x86_64/cm/cloudera-manager.repo -P /etc/yum.repos.d/

## Install Cloudera Manager and dependencies: 
yum install -y oracle-j2sdk1.7
yum install -y cloudera-manager-daemons cloudera-manager-server
yum install -y cloudera-manager-server-db-2

## Start servers:
service cloudera-scm-server-db start
service cloudera-scm-server start

Save the file, then make the script executable and run it:
chmod 744 install_cm.sh
./install_cm.sh
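
The Cloudera Manager server can take a minute or two to start listening on port 7180. The following sketch polls the port until it accepts connections; it assumes bash with /dev/tcp support and the coreutils `timeout` command, and the function name `wait_for_port` is made up for this example.

```shell
#!/bin/bash
# Return 0 once host:port accepts a TCP connection, or 1 after
# `tries` failed attempts spaced `delay` seconds apart.
wait_for_port() {
  local host=$1 port=$2 tries=${3:-30} delay=${4:-10}
  local i
  for ((i = 1; i <= tries; i++)); do
    # Bash opens a TCP connection when redirecting to /dev/tcp/HOST/PORT.
    if timeout 5 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null; then
      return 0
    fi
    # Pause between attempts, but not after the last one.
    if [ "$i" -lt "$tries" ]; then
      sleep "$delay"
    fi
  done
  return 1
}

# Example: wait up to about five minutes for the web console on this host.
# wait_for_port localhost 7180 30 10 && echo "Cloudera Manager is listening"
```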

Install CDH and Hue with Cloudera Manager

Follow the wizard defaults for a simple installation. Less intuitive areas are explained below.
  1. Point a browser to: http://<public dns>.<region>.compute.amazonaws.com:7180.
  2. After a minute or two, log on as admin / admin.
  3. Accept the user agreement and continue until you reach the wizard.
Welcome Steps
  1. End User License Terms and Conditions.
  2. Which edition do you want to deploy? >> Select Cloudera Enterprise Data Hub Edition Trial.
  3. Thank you for choosing Cloudera Manager and CDH.
  4. Specify hosts for your CDH cluster installation. >> Input comma-separated hostnames.
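
To build the comma-separated list for the host field, one option is to keep one hostname per line in a file and join the lines. This is a small sketch; the function name `join_hosts` and the hostnames are hypothetical.

```shell
#!/bin/bash
# Join hostnames (one per line on stdin) into a comma-separated list,
# dropping blank lines.
join_hosts() {
  grep -v '^[[:space:]]*$' | paste -sd, -
}

# Hypothetical hostnames; replace them with those of your instances.
join_hosts <<'EOF'
ip-10-0-0-11.ec2.internal
ip-10-0-0-12.ec2.internal
ip-10-0-0-13.ec2.internal
ip-10-0-0-14.ec2.internal
EOF
```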
Cluster Installation (7 steps)
  1. Select Repository. >> For the latest release in parcels, keep the defaults.
  2. JDK Installation Options. >> Check both boxes.
  3. Enable Single User Mode. >> Leave single user mode disabled if possible.
  4. Provide SSH login credentials. >> Set the user to ec2-user and upload <private_key>.pem.
  5. Installation in progress.
  6. Installing Selected Parcels.
  7. Inspect hosts for correctness and click Finish. >> Repair issues as necessary.
Cluster Setup (6 steps)
  1. Choose the CDH 5 services that you want to install on your cluster. >> Select Core services with Impala.
  2. Customize Role Assignments >> Add 2 ZooKeeper roles (for a total ensemble of 3). See Designing a ZooKeeper Deployment.
  3. Database Setup >> Use Embedded Database (Postgres). Copy the Hue password for safekeeping.
  4. Review Changes.
  5. First Run Command.
  6. Click Finish.