Creating a Cloudera Manager AMI

In a larger production environment, you can optimize instance start times by creating an Amazon machine image (AMI). Create this AMI after you complete the initial installation of Cloudera Manager (for CentOS/RHEL, adding the appropriate repo file to the /etc/yum.repos.d/ directory, yum install cloudera-scm-server).
  1. Download the appropriate parcel file. For example: http://archive.cloudera.com/cdh5/parcels/5.3.0/CDH-5.3.0-1.cdh5.3.0.p0.30-wheezy.parcel
  2. Copy the parcel file, depending on your Cloudera Director installation.
    • For Cloudera Director Server
      • Copy the parcel to the /opt/cloudera/parcel-repo directory. Create this directory if it does not exist.
      • Calculate the SHA hash for the AMI and place it in /opt/cloudera/parcel-repo. Name the file the same as the parcel, but add ".sha" to the filename.
    • For Cloudera Director Client:
      • Copy the parcel to the /opt/cloudera/parcel-cache directory. Create this directory if it does not exist.
  3. Make sure that the /opt/cloudera directory and its sub-directories are owned by cloudera-scm. Add the cloudera-scm user if it does not already exist.

The following example script completes all the steps.

#!/usr/bin/env bash
#
# Copyright (c) 2014 Cloudera, Inc. All rights reserved.
#

# For this script to work properly, you need to supply a URL to a parcel file,
# e.g. http://archive.cloudera.com/cdh5/parcels/5.3.0/CDH-5.3.0-1.cdh5.3.0.p0.30-wheezy.parcel

# You can do this one of two ways:
# 1. Set a PARCEL_URL environment variable.
# 2. Supply an argument that is a PARCEL_URL.

# This script will have to be re-run for each parcel you want to cache on the
# image that you are building.

if [ -z ${PARCEL_URL+set} ]
then
  if [ "$#" -ne 1 ]
  then
    echo "Usage: $0 <parcel-url>"
    echo ""
    echo "Alternatively, set the environment variable PARCEL_URL prior to"
    echo "running this script."
    exit 1
  else
    PARCEL_URL=$1
  fi
fi

sudo useradd -r cloudera-scm
sudo mkdir -p /opt/cloudera/parcels /opt/cloudera/parcel-repo /opt/cloudera/parcel-cache

PARCEL_NAME="${PARCEL_URL##*/}"

echo "Downloading parcel from $PARCEL_URL"
sudo curl "${PARCEL_URL}" -o "/opt/cloudera/parcel-repo/$PARCEL_NAME"
sudo curl "${PARCEL_URL}.sha1" -o "/opt/cloudera/parcel-repo/$PARCEL_NAME.sha"

sudo cp /opt/cloudera/parcel-repo/*.parcel /opt/cloudera/parcel-cache

sudo chown -R cloudera-scm.cloudera-scm /opt/cloudera