Altus Data Engineering Clusters

You can use the Cloudera Altus console or the command-line interface to create and manage Altus Data Engineering clusters. The Altus Data Engineering service provisions single-user, transient clusters.

By default, the Altus Data Engineering service creates a cluster that contains a master node and multiple worker nodes. The Altus Data Engineering service also creates a Cloudera Manager instance to manage the cluster. The Cloudera Manager instance provides visibility into the cluster but is not a part of the cluster. You cannot use the Cloudera Manager instance as a gateway node for the cluster.

Cloudera Manager configures the master node with roles that give it the capabilities of a gateway node. The master node has a resource manager, Hive server and metastore, Spark service, and other roles and client configurations that essentially turn it into a gateway node. You can use the master node as a gateway node in an Altus Data Engineering cluster to run Hive and Spark shell commands and Hadoop commands.

The Altus Data Engineering service creates a read-only user account to connect to the Cloudera Manager instance. When you create a cluster on the Altus console, specify the user name and password for the read-only user account. Use the user name and password to log in to Cloudera Manager.

When you create a cluster using the CLI and you do not specify a user name and password, the Altus Data Engineering service creates a guest user account with a randomly generated password. You can use the guest user name and password to log in to Cloudera Manager.

For more information about the guest user account generated through the CLI, see Cloudera Manager Connection.

Altus appends tags to each node in a cluster. You can use the tags to identify the nodes and the cluster that they belong to.

For more information about the tags, see Altus Tags.

When you create an Altus Data Engineering cluster, you specify which service runs in the cluster. Select the service appropriate for the type of job that you plan to run on the cluster.

The following table describes the services available in Altus clusters and the types of jobs you can run with each service:

Service Type     Job Type
Hive             Hive
Hive on Spark    Hive
Spark 2.x        Spark or PySpark
Spark 1.6        Spark or PySpark
MapReduce2       MapReduce2
Multi            Hive, Spark, PySpark, MapReduce2

The Multi service cluster supports Spark 2.x. It does not support Spark 1.6.
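The service-to-job mapping above can be expressed as a small lookup, which may be handy when a script needs to pick a service type before creating a cluster. This is only a sketch: the dictionary and the function names can_run and services_for_job are illustrative and are not part of the Altus console, CLI, or API.

```python
# Illustrative lookup only; these names are not part of any Altus API.
# Keys are the service types listed above, values are the job types
# that a cluster running that service can accept.
SUPPORTED_JOB_TYPES = {
    "Hive": {"Hive"},
    "Hive on Spark": {"Hive"},
    "Spark 2.x": {"Spark", "PySpark"},
    "Spark 1.6": {"Spark", "PySpark"},
    "MapReduce2": {"MapReduce2"},
    # Spark and PySpark jobs on a Multi cluster run on Spark 2.x, not Spark 1.6.
    "Multi": {"Hive", "Spark", "PySpark", "MapReduce2"},
}


def can_run(service_type, job_type):
    """Return True if a cluster with this service type accepts this job type."""
    return job_type in SUPPORTED_JOB_TYPES.get(service_type, set())


def services_for_job(job_type):
    """Return the service types, sorted by name, that can run the job type."""
    return sorted(
        service
        for service, jobs in SUPPORTED_JOB_TYPES.items()
        if job_type in jobs
    )
```

For example, services_for_job("Hive") returns the Hive, Hive on Spark, and Multi service types, and can_run("Hive", "Spark") returns False.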

Cluster Status

The status of a cluster changes as the cluster moves from creation to termination.

An Altus cluster can have the following statuses:
  • Creating. The cluster creation process is in progress.
  • Created. The cluster was successfully created.
  • Failed. The cluster can be in a failed state at creation or at termination time. View the failure message to get more information about the failure.
  • Terminating. The cluster is in the process of being terminated.

    When the cluster is terminated, it is removed from the list of clusters displayed on the Clusters page of the console. It also no longer appears in the list of clusters returned by the list-clusters command.
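The statuses above can be modeled in a script that reacts to cluster state. The following is a minimal sketch, assuming the status arrives as one of the four names listed above; the exact strings returned by the CLI or API may differ in case or format. The ClusterStatus class and the next_action and is_terminated helpers are hypothetical names, not Altus functions.

```python
from enum import Enum


class ClusterStatus(Enum):
    # Status names as listed in this section; the real output format
    # of the Altus CLI or API is an assumption here.
    CREATING = "Creating"
    CREATED = "Created"
    FAILED = "Failed"
    TERMINATING = "Terminating"


# Statuses that mean an operation is still running on the cluster.
IN_PROGRESS = {ClusterStatus.CREATING, ClusterStatus.TERMINATING}


def next_action(status_text):
    """Map a status name to a short hint about what to do next."""
    status = ClusterStatus(status_text)  # raises ValueError for unknown names
    if status in IN_PROGRESS:
        return "wait"
    if status is ClusterStatus.FAILED:
        return "check failure message"
    return "ready"


def is_terminated(cluster_name, listed_names):
    """A terminated cluster no longer appears in the list of clusters."""
    return cluster_name not in listed_names
```

Because a terminated cluster has no status of its own, the sketch treats absence from the cluster list as the signal that termination has finished.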