The Cloudera Director Configuration File

This section describes the configuration file used when launching a cluster through the Cloudera Director client with the bootstrap command, or through the Cloudera Director server with the bootstrap-remote command. The configuration file also includes additional advanced settings that are documented in comments within the file.

File Location

To create a configuration file, install the cloudera-director-client package, open aws.simple.conf or aws.reference.conf, and save it as aws.conf. The sample configuration files are found either in /usr/lib64/cloudera-director/client or /usr/lib/cloudera-director/client, depending on the operating system you are using. Copy the sample files to your home directory before editing them.

Environment Settings

This section describes basic settings you must configure before deploying a cluster.

Setting

Type

Required

Description

name

string

yes

Specifies the name of the cluster in Cloudera Manager.

Example: C5-Reference-AWS

Default: none

provider

container

yes

Container for the cloud infrastructure provider.

id

string

yes

The ID of the cloud infrastructure provider; leave this set to aws.

Example: aws

Valid Values: aws

Default: aws

accessKeyId

string

yes

The access key used to make AWS requests. Make sure the value is enclosed in double quotes.

Example: RQU1JC3XKTTYYJTXDR

Valid Values: Valid AWS access key.

Default: none

secretAccessKey

string

yes

The secret access key used to make AWS requests. Make sure the value is enclosed in double quotes.

Example: vVdg/y4ArVuxdlzZoO06139xTSd5V5S8

Valid Values: Valid AWS secret key.

Default: none

publishAccessKeys

boolean

no

Specifies whether Cloudera Director automatically publishes your credentials as cluster configurations for Amazon S3 access.

Example: true

Valid Values: true | false

Default: false

region

string

yes

The region in which to launch the cluster.

Example: us-west-2

Valid Values: See Availability Zones.

Default: none

regionEndpoint

string

no

Specifies the region endpoint for clusters launched in the .gov region. If you are not launching in the .gov region, leave this commented out.

Example: ec2.us-gov-west-1.amazonaws.com

Valid Values: any valid .gov region endpoint.

Default: none

keyName

string

yes

The name of the key pair used to start the cluster launcher.

Example: my-cloudera-keypair

Valid Values: any valid key pair associated with the region.

Default: none

subnetId

string

yes

ID of the subnet that you noted earlier.

Example: subnet-5b818f1d

Valid Values: any valid subnet ID in the region.

Default: none

securityGroupsIds

string

yes

ID of the security group that you noted earlier. Use the ID of the group, not the name (for example, sg-b139d3d3, not default). To specify more than one security group, separate the IDs with commas and enclose the whole string in double quotes.

Example: sg-b139d3d3

Valid Values: any valid security group ID in the region.

Default: none

instanceNamePrefix

string

yes

The prefix used when launching instances. This prefix becomes part of the instance name, which you can use to find instances started by Cloudera Director in the AWS Console.

Example: skynet-cluster-1

Valid Values: any string

Default: none

rootVolumeSizeGB

integer

yes

Sets the size of the root volume for the cluster launcher.

Example: 100

Default: 50

associatePublicIpAddresses

boolean

no

Specifies whether nodes will have public IP addresses. To optimize Amazon S3 data transfer performance, set this to true.

Example: true

Valid Values: true | false

Default: false

image

string

yes

Specifies the AMI to use. Cloudera recommends Red Hat Enterprise Linux 6.4 (64-bit). To find the correct AMI for the selected region, visit the Red Hat AWS Partner Page.

Example: ami-22558833

Valid Values: Any valid AMI running Red Hat Enterprise Linux 6.4 (64-bit).

Default: none

Note: For more information about AMI selection, see Choosing an AMI.

ssh

container

yes

Container for SSH settings.

username

string

yes

Specifies the username for SSH access to the instances.

Example: ec2-user

Default: none

privateKey

string

yes

Specifies the location of the SSH private key.

Example: ${?HOME}/.ssh/director_id_rsa

Default: none
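
The environment settings above might be combined into a fragment like the following. This is a minimal sketch, not a copy of the shipped sample file: the values are the examples from the table, and the nesting (which settings sit inside provider, and ssh as a separate top-level container) is an assumption inferred from the order of the table rows. Check it against aws.reference.conf before use.

```hocon
# Sketch only: nesting is inferred from the settings table, not taken
# from the shipped sample file.
name: C5-Reference-AWS

provider {
    id: aws

    # Credentials must be enclosed in double quotes.
    accessKeyId: "RQU1JC3XKTTYYJTXDR"
    secretAccessKey: "vVdg/y4ArVuxdlzZoO06139xTSd5V5S8"
    publishAccessKeys: false

    region: us-west-2
    # Leave commented out unless launching in the .gov region.
    # regionEndpoint: ec2.us-gov-west-1.amazonaws.com

    keyName: my-cloudera-keypair
    subnetId: subnet-5b818f1d
    securityGroupsIds: sg-b139d3d3

    instanceNamePrefix: skynet-cluster-1
    rootVolumeSizeGB: 100
    associatePublicIpAddresses: true
    image: ami-22558833
}

ssh {
    username: ec2-user
    # ${?HOME} is an optional substitution, resolved from the environment.
    privateKey: ${?HOME}/.ssh/director_id_rsa
}
```

The access keys shown are the placeholder examples from the table and are not usable credentials.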

Instance Settings

This section describes settings that define instance types. Once these instance types are defined, you can use them to launch Cloudera Manager and the nodes in the cluster.

Setting

Type

Required

Description

instances

container

yes

The container that specifies instance settings.

instance_type

container

yes

A container that specifies settings for a type of instance to launch. You can name this container with any string value. For example, you can create an instance type called "cm" that uses an m1.large instance and another instance type called "node" that uses an m1.xlarge instance.

type

string

yes

The type of instances to launch.

Example: m3.2xlarge

Valid Values: Any valid instance name. For a list of valid instance types, go to Instance Types.

Default: none

bootstrapScript

string

yes

Linux shell script that executes whenever a cluster instance reboots.

After the instance boots, this script runs automatically. The script can contain anything you need for your environment, including libraries, monitoring tools, and security configurations.

Example: """#!/bin/sh

# This is an embedded bootstrap script that runs
# as root and can be used to customize
# the instances immediately after boot and before
# any other Cloudera Director action

# If the exit code is not zero Cloudera Director will
# automatically retry

echo 'Hello World!'
exit 0

"""

Valid Values: any valid script

Default: none

tags

container

yes

Container for any tags to apply to the instances. These tags can be used to find your instances in the AWS Console or on your AWS invoice.

tag

string

yes

Specifies the name and value of the tag.

Example: department: "Data Science"

Valid Values: Any valid name/value pair.

Default: none
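
As an illustration of the instance settings above, the fragment below defines two instance types. It is a sketch under assumptions: the container names cm and nodes are arbitrary labels (any string is allowed), the instance sizes follow the m1.large and m1.xlarge example in the table, and placing tags and bootstrapScript directly inside each instance type is inferred from the table order.

```hocon
# Sketch only: cm and nodes are illustrative names.
instances {
    cm {
        type: m1.large
        tags {
            department: "Data Science"
        }
    }

    nodes {
        type: m1.xlarge

        # Triple-quoted strings may span several lines; the shebang
        # follows the opening quotes so it is the first line of the script.
        bootstrapScript: """#!/bin/sh
echo 'Hello World!'
exit 0
"""

        tags {
            department: "Data Science"
        }
    }
}
```

Instance types defined this way can then be referenced elsewhere in the file with substitutions such as ${instances.cm}.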

Cloudera Manager Settings

This section describes settings for the Cloudera Manager instance.

Setting

Type

Required

Description

cloudera-manager

container

yes

The container for Cloudera Manager settings.

instance

string

yes

Specifies which of the instance types that you defined in Instance Settings to use.

Example: ${instances.cm}

Valid Values: any instance type that you defined earlier.

Default: none

tags

container

yes

Container for any tags to apply to the Cloudera Manager instance.

tag

string

yes

Specifies the name and value of the tag.

Example: application: "Cloudera Manager 5"

Valid Values: Any valid name/value pair.

Default: none

customBannerText

string

no

Specifies custom banner text to display in Cloudera Manager.

Example: "Managed by Cloudera Director"

Valid Values: any valid string

Default: none

enableEnterpriseTrial

boolean

no

When set to true, automatically enables a 60-day Cloudera Enterprise trial.

Example: true

Valid Values: true | false

Default: false
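
A Cloudera Manager block built from the settings above could look like the following sketch. It assumes that an instance type named cm is defined in the instances container elsewhere in the same file, and that tags are merged into the referenced instance type; that placement is inferred from the table rather than confirmed by it.

```hocon
# Sketch only: requires instances.cm to be defined elsewhere in the file.
cloudera-manager {
    # The substitution pulls in the cm instance type; the object that
    # follows it is merged on top, adding a tag for this instance only.
    instance: ${instances.cm} {
        tags {
            application: "Cloudera Manager 5"
        }
    }

    customBannerText: "Managed by Cloudera Director"
    enableEnterpriseTrial: true
}
```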

Database Settings

This section describes settings for configuring external databases. This section is optional. If no settings are specified, Cloudera Director uses the embedded PostgreSQL database.

Setting

Type

Required

Description

databases

container

no

The container for databases.

CLOUDERA_MANAGER

container

no

The container for the database used by Cloudera Manager.

ACTIVITYMONITOR

container

no

The container for the database used by the Activity Monitor.

REPORTSMANAGER

container

no

The container for the database used by the Reports Manager.

NAVIGATOR

container

no

The container for the database used by Navigator.

type

string

no

The type of database.

Example: postgresql

Valid Values: postgresql | mysql

Default: none

Note: Cloudera currently provides PostgreSQL drivers. Drivers for other databases must be added with the bootstrap script.

host

string

no

The database host.

Example: db.example.com

Default: none

port

string

no

The database port.

Example: 123

Default: none

user

string

no

A database user.

Example: dbuser

Default: none

password

string

no

The password of the database user.

Example: Pa$$word

Default: none

name

string

no

The name of the database.

Example: cmdb

Default: none
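
For illustration, an external database for Cloudera Manager might be declared as in the sketch below. The values are the examples from the table, and the assumption that the connection settings sit directly inside each named container is inferred from the table order. The ACTIVITYMONITOR, REPORTSMANAGER, and NAVIGATOR containers would take the same settings.

```hocon
# Sketch only: values are the illustrative examples from the table.
databases {
    CLOUDERA_MANAGER {
        type: postgresql
        host: db.example.com
        port: 123
        user: dbuser
        # Quoted because $ is not allowed in an unquoted HOCON string.
        password: "Pa$$word"
        name: cmdb
    }
}
```

Omitting the databases container entirely leaves Cloudera Director on the embedded PostgreSQL database, as described at the start of this section.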

Cluster Settings

This section describes products and services to launch on instances in the cluster.

Setting

Type

Required

Description

cluster

container

yes

The container for the cluster.

products

container

yes

The container for products to launch.

CDH

string

no

The version of CDH to launch.

Example: 5

Valid Values: 4 | 5

Default: 4

IMPALA

string

yes

The version of Impala to launch.

Example: 1.2

Default: none

services

array

yes

An array of services to launch.

Example: [HDFS, YARN, ZOOKEEPER, HBASE, HIVE, HUE, OOZIE]

Valid Values: HBASE, HDFS, HIVE, HUE, IMPALA, KS_INDEXER, MAPREDUCE, OOZIE, SOLR, SPARK, SQOOP, YARN, and ZOOKEEPER.

Default: none

HIVE

container

no

The container for the database used by Hive. All Hive database settings are commented out by default.

type

string

no

The type of Hive database.

Example: postgresql

Valid Values: postgresql | mysql

Default: none

host

string

no

The Hive database host.

Example: db.example.com

Default: none

port

string

no

The Hive database port.

Example: 123

Default: none

user

string

no

A database user for Hive.

Example: dbuser

Default: none

password

string

no

The password of the database user.

Example: Pa$$word

Default: none

name

string

no

The name of the Hive database.

Example: cmdb

Default: none

masters

container

yes

The container for service masters.

count

integer

yes

The number of instances to launch.

instance

string

yes

Specifies which of the instance types that you defined in Instance Settings to use.

Example: ${instances.nodes}

Valid Values: any instance type that you defined earlier.

Default: none

tags

container

yes

Container for any tags to apply to the instances.

tag

string

yes

Specifies the name and value of the tag.

Example: group: master

Valid Values: Any valid name/value pair.

Default: none

roles

container

yes

Container for roles.

role

string

yes

Specifies the roles to apply to the masters.

Example:

HDFS: ${roles.HDFS_MASTERS}

YARN: ${roles.YARN_MASTERS}

ZOOKEEPER: ${roles.ZOOKEEPER_MASTERS}

HBASE: ${roles.HBASE_MASTERS}

HIVE: ${roles.HIVE_MASTERS}

HUE: ${roles.HUE_MASTERS}

OOZIE: ${roles.OOZIE_MASTERS}

Default: none

workers

container

yes

Container for workers to launch.

count

integer

yes

The number of instances to launch.

instance

string

yes

Specifies the instance type that you defined in Instance Settings.

Example: ${instances.nodes}

Valid Values: any instance type that you defined earlier.

Default: none

tags

container

yes

Container for any tags to apply to the instances.

tag

string

yes

Specifies the name and value of the tag.

Example: group: master

Valid Values: Any valid name/value pair.

Default: none

roles

container

yes

Container for roles.

role

string

yes

Specifies the roles to apply to the workers.

Example:

HDFS: ${roles.HDFS_MASTERS}

YARN: ${roles.YARN_MASTERS}

HBASE: ${roles.HBASE_MASTERS}

Default: none

placementGroup

string

yes

Specifies the placement group in which to launch the instance. For more information, see Placement Groups.

gateways

container

yes

Container for gateways to launch.

Note: Although this container is called gateways, containers at this level can use any name to launch a set of instances with shared instance settings and roles.

count

integer

yes

The number of instances to launch.

instance

string

yes

Specifies the instance type that you defined in Instance Settings.

Example: ${instances.nodes}

Valid Values: any instance type that you defined earlier.

Default: none

tags

container

yes

Container for any tags to apply to the instances.

tag

string

yes

Specifies the name and value of the tag.

Example: group: master

Valid Values: Any valid name/value pair.

Default: none

roles

container

yes

Container for roles.

role

string

yes

Specifies the roles to apply to the gateways.

Example:

HIVE: ${roles.HIVE_MASTERS}

Default: none
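
The cluster settings above might fit together as in the following sketch. Several things here are assumptions: the count value is illustrative, the nesting is inferred from the table order, an instance type named nodes must be defined in the instances container, and the ${roles...} substitutions (copied from the table examples) require a roles object defined elsewhere in the file, which this section does not describe. The HIVE database container and placementGroup are left out.

```hocon
# Sketch only: depends on instances.nodes and a roles object being
# defined elsewhere in the same file.
cluster {
    products {
        CDH: 5
        IMPALA: 1.2    # example value from the table
    }

    services: [HDFS, YARN, ZOOKEEPER, HBASE, HIVE, HUE, OOZIE]

    masters {
        count: 1       # illustrative value

        instance: ${instances.nodes} {
            tags {
                group: master
            }
        }

        roles {
            HDFS: ${roles.HDFS_MASTERS}
            YARN: ${roles.YARN_MASTERS}
            ZOOKEEPER: ${roles.ZOOKEEPER_MASTERS}
            HBASE: ${roles.HBASE_MASTERS}
            HIVE: ${roles.HIVE_MASTERS}
            HUE: ${roles.HUE_MASTERS}
            OOZIE: ${roles.OOZIE_MASTERS}
        }
    }

    # workers, gateways, and any other named group take the same
    # count, instance, and roles settings as masters.
}
```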