Post-Creation Scripts

There are three kinds of post-creation scripts: those for the Cloudera Manager deployment, those for a CDH cluster as a whole, and those for each instance in a CDH cluster. A script can be written in any scripting language that can be interpreted on the system where it runs.

Each type of post-creation script can be specified either by embedding scripts in the configuration file or by including paths to script files on the local filesystem:
  • Embedding in the configuration file: For deployment post-creation scripts, include your script in the postCreateScripts section within the cloudera-manager {} configuration block. For cluster post-creation scripts, include your script in the postCreateScripts or instancePostCreateScripts section within the cluster {} configuration block. These blocks can take an array of scripts, similar to the bootstrapScript that can be placed inside the instance {} configuration block.
  • Including paths to files on the local filesystem: For deployment post-creation scripts, include the path to a script in the postCreateScriptsPaths section within the cloudera-manager {} configuration block. For cluster post-creation scripts, include the path to a script in the postCreateScriptsPaths or instancePostCreateScriptsPaths section within the cluster {} configuration block. You can provide an array of paths to arbitrary files on the local filesystem. This is similar to the bootstrapScriptPath directive. Cloudera Director reads the files from the filesystem and uses their contents as post-creation scripts.

Post-creation scripts are available through the configuration file or the Cloudera Director API, but not through the Cloudera Director web UI.

Deployment Post-creation Scripts

Deployment-level post-creation scripts run as root on the Cloudera Manager instance when the Cloudera Manager deployment is complete. They are configured in the postCreateScripts section within the cloudera-manager {} block of the configuration file.

Deployment-level post-creation scripts can be used to customize the Cloudera Manager instance after the deployment has been created, for example, to add a package or modify a file on the Cloudera Manager instance.
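
As a minimal sketch of the "modify a file" case, the following Python code appends a setting to a file only if it is not already present, and reports failure through a nonzero return code (the reference configuration notes that a nonzero exit code causes Cloudera Director to fail). The names ensure_line and main, and any path or setting passed to them, are hypothetical and are not defined by Cloudera Director.

```python
import os
import sys


def ensure_line(path, line):
    """Append `line` to the file at `path` unless an identical line exists.

    Returns True if the file was changed, False if it was already up to date.
    """
    content = ""
    if os.path.exists(path):
        with open(path) as handle:
            content = handle.read()
    if line in content.splitlines():
        return False
    with open(path, "a") as handle:
        # Keep the new setting on its own line even if the file
        # does not end with a newline.
        if content and not content.endswith("\n"):
            handle.write("\n")
        handle.write(line + "\n")
    return True


def main(path, line):
    """Return 0 on success and 1 on failure, for use as the exit code."""
    try:
        changed = ensure_line(path, line)
    except (IOError, OSError) as error:
        sys.stderr.write("Could not update %s: %s\n" % (path, error))
        return 1
    print("updated" if changed else "already up to date")
    return 0
```

A script built on this sketch would end with a call such as sys.exit(main("/etc/example-app.conf", "example.setting=true")), where both arguments are placeholders. Checking for the line before appending keeps the script safe to run more than once.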

Multiple post-creation scripts can be supplied. They will run in the order they are listed in the configuration file.

The following code block is an excerpt from the reference configuration file on the Cloudera GitHub site:

cloudera-manager {
...
    postCreateScripts: ["""#!/bin/sh

# This is an embedded post-creation script that runs as root and can be used to
# customize the Cloudera Manager instance after the deployment has been created.

# If the exit code is not zero Cloudera Director will fail

# Post-creation scripts also have access to the following environment variables:

#    DEPLOYMENT_HOST_PORT
#    ENVIRONMENT_NAME
#    DEPLOYMENT_NAME
#    CM_USERNAME
#    CM_PASSWORD

echo 'Hello World!'
exit 0
    """,
    """#!/usr/bin/python

# Additionally, multiple post-creation scripts can be supplied.  They will run
# in the order they are listed here.  Interpreters other than bash can be used
# as well.

print('Hello again!')
    """]

    # For more complex scripts, post-creation scripts can be supplied via local
    # filesystem paths. They will run after any scripts supplied in the previous
    # postCreateScripts section.
    # postCreateScriptsPaths: ["/tmp/test-script.sh",
    #                         "/tmp/test-script.py"]
...
}

Cluster Post-creation Scripts

There are two types of cluster post-creation scripts:
  • A cluster-level script is run on a single arbitrary instance in the cluster.
  • An instance-level script is run on each instance in the cluster.

As with deployment-level scripts, cluster post-creation scripts can be specified either by embedding scripts in the configuration file or by including paths to script files on the local filesystem. For both instance-level and cluster-level scripts (and unlike bootstrapScript and bootstrapScriptPath), both post-creation scripting methods can be used simultaneously. For example, postCreateScripts could be used for setup (package installation, light system configuration), and postCreateScriptsPaths could be used to refer to more complex scripts that may depend on the configuration that was performed in postCreateScripts.

Instance-level and cluster-level scripts run when bootstrapping is complete and the cluster is ready. Instance-level scripts also run when you grow a cluster by adding instances, but they do not run when instances are migrated manually. When a cluster defines both instance-level and cluster-level scripts, embedded or referenced by path, the scripts run in the following order:
  1. Everything in the instancePostCreateScripts block is run sequentially.
  2. Everything in instancePostCreateScriptsPaths is run sequentially.
  3. Everything in the postCreateScripts block is run sequentially.
  4. Everything in postCreateScriptsPaths is run sequentially.
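
The ordering above can be modeled with a short Python sketch. It is only an illustration of the documented order, not Cloudera Director's implementation: execution_order is a hypothetical helper, and the dictionary stands in for a cluster {} block with made-up script names.

```python
def execution_order(cluster_config):
    """Return (setting, script) pairs in the documented run order.

    `cluster_config` is a plain dict standing in for the cluster {} block;
    settings that are absent are treated as empty lists.
    """
    phases = [
        "instancePostCreateScripts",
        "instancePostCreateScriptsPaths",
        "postCreateScripts",
        "postCreateScriptsPaths",
    ]
    return [(phase, script)
            for phase in phases
            for script in cluster_config.get(phase, [])]


# Hypothetical configuration; the key order here does not affect the result.
example = {
    "postCreateScripts": ["cluster-setup"],
    "instancePostCreateScriptsPaths": ["/tmp/test-script.sh"],
    "instancePostCreateScripts": ["install-package", "tune-os"],
}

for phase, script in execution_order(example):
    print("%s: %s" % (phase, script))
```

In this example the two embedded instance-level scripts come first, then the instance-level script referenced by path, then the cluster-level script. Keep in mind that the instance-level entries run on every instance, while the cluster-level entries run on a single arbitrary instance.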

Cluster-level Post-creation Scripts

Cluster-level post-creation scripts run as root on a single arbitrary instance in a cluster after the cluster has been created. As with instance-level post-creation scripts, they are configured in the cluster section of the configuration file. They run after any instance post-creation scripts.

The following code block is an excerpt from the reference configuration file on the Cloudera GitHub site:

cluster {
...
    postCreateScripts: ["""#!/bin/sh
    
# This is an embedded post-creation script that runs as root and can be used to
# customize the cluster after it has been created. This will run only once,
# at a cluster level, on an arbitrary cluster instance.

# If the exit code is not zero Cloudera Director will fail

# Post-creation scripts also have access to the following environment variables:

#    DEPLOYMENT_HOST_PORT
#    ENVIRONMENT_NAME
#    DEPLOYMENT_NAME
#    CLUSTER_NAME
#    CM_USERNAME
#    CM_PASSWORD

echo 'Hello World!'
exit 0
    """,
    """#!/usr/bin/python

# Additionally, multiple post-creation scripts can be supplied.  They will run
# in the order they are listed here.  Interpreters other than bash can be used
# as well.

print('Hello again!')
    """]

    # For more complex scripts, post-creation scripts can be supplied via local
    # filesystem paths. They will run after any scripts supplied in the previous
    # postCreateScripts section.
    # postCreateScriptsPaths: ["/tmp/test-script.sh",
    #                         "/tmp/test-script.py"]
}

Instance-level Post-creation Scripts

Instance-level post-creation scripts run as root on each instance in a cluster after the cluster has been created. They are configured in the cluster section of the configuration file.

They run before any cluster-level post-creation scripts. Use instance-level post-creation scripts for tasks that must be performed separately on each instance, such as adding a package to all cluster instances or modifying a file on all cluster instances.

The following code block is an excerpt from the reference configuration file on the Cloudera GitHub site:

cluster {
...
    instancePostCreateScripts: ["""#!/bin/sh

# This is an embedded instance post-creation script that runs as root and can be used to
# customize each cluster instance after the cluster has been created. This script will run
# on every cluster instance. These scripts run before postCreateScripts, which are at cluster level.

# If the exit code is not zero Cloudera Director will fail

# Instance post-creation scripts also have access to the following environment variables:

#    DEPLOYMENT_HOST_PORT
#    ENVIRONMENT_NAME
#    DEPLOYMENT_NAME
#    CLUSTER_NAME
#    CM_USERNAME
#    CM_PASSWORD

echo 'Hello World!'
exit 0
    """,
    """#!/usr/bin/python

# Additionally, multiple instance post-creation scripts can be supplied.  They will run
# in the order they are listed here.  Interpreters other than bash can be used
# as well.

print('Hello again!')
    """]

    # For more complex scripts, instance post-creation scripts can be supplied via local
    # filesystem paths. They will run after any scripts supplied in the previous
    # instancePostCreateScripts section.
    # instancePostCreateScriptsPaths: ["/tmp/test-script.sh",
    #                         "/tmp/test-script.py"]
...
}

Predefined Environment Variables

As noted in the code comments above, post-creation scripts have access to several environment variables defined by Cloudera Director. Use these variables in your scripts to communicate with Cloudera Manager and configure it after Cloudera Director has completed its tasks.

Deployment-level post-creation scripts do not use the cluster name variable, because a Cloudera Manager deployment can include multiple clusters.

Variable Name          Example                Description
---------------------  ---------------------  -----------
DEPLOYMENT_HOST_PORT   192.168.1.100:7180     The host and port used to connect to the Cloudera Manager deployment that this cluster belongs to.
ENVIRONMENT_NAME       director_environment   The name of the environment that this cluster belongs to.
DEPLOYMENT_NAME        director_deployment    The name of the Cloudera Manager deployment that this cluster belongs to.
CLUSTER_NAME           director_cluster       The name of the cluster. The Cloudera Manager API needs this to specify which cluster on a Cloudera Manager server to operate on.
CM_USERNAME            admin                  The username needed to connect to the Cloudera Manager deployment.
CM_PASSWORD            admin                  The password needed to connect to the Cloudera Manager deployment.
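
The following Python sketch shows one way a cluster-level or instance-level script might turn these variables into a Cloudera Manager API request target. The helper names, the http scheme, and the API version v12 are assumptions made for illustration; check the scheme and the API version against your own Cloudera Manager deployment. The sketch only builds the URL and the Basic authentication header and does not send a request.

```python
import base64

try:  # Python 3
    from urllib.parse import quote
except ImportError:  # Python 2
    from urllib import quote

# Assumptions for illustration only; verify both against your deployment.
SCHEME = "http"       # switch to "https" if Cloudera Manager uses TLS
API_VERSION = "v12"   # hypothetical; use a version your Cloudera Manager supports


def cluster_api_url(env):
    """Build the Cloudera Manager API URL for the cluster named in `env`.

    Raises KeyError if a required variable such as CLUSTER_NAME is missing.
    """
    return "%s://%s/api/%s/clusters/%s" % (
        SCHEME,
        env["DEPLOYMENT_HOST_PORT"],
        API_VERSION,
        quote(env["CLUSTER_NAME"], safe=""),
    )


def basic_auth_header(env):
    """Return an HTTP Basic Authorization value from CM_USERNAME and CM_PASSWORD."""
    pair = "%s:%s" % (env["CM_USERNAME"], env["CM_PASSWORD"])
    return "Basic " + base64.b64encode(pair.encode("utf-8")).decode("ascii")


# Example values from the table above; a real script would pass os.environ.
example_env = {
    "DEPLOYMENT_HOST_PORT": "192.168.1.100:7180",
    "CLUSTER_NAME": "director_cluster",
    "CM_USERNAME": "admin",
    "CM_PASSWORD": "admin",
}

print(cluster_api_url(example_env))
```

Passing the environment as an argument, rather than reading os.environ inside each function, keeps the helpers easy to try outside of Cloudera Director. Avoid printing the value returned by basic_auth_header, because it contains the password in an easily decoded form.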