Altus Data Warehouse Clusters on the Console

You can create a Data Warehouse cluster and view the status and configuration of the cluster on the Cloudera Altus Console.

Creating a Data Warehouse Cluster for AWS

To create a Data Warehouse cluster on the console:
  1. Sign in to the Cloudera Altus console:

    https://console.altus.cloudera.com/

  2. On the side navigation panel, go to the Data Warehouse section and click Clusters.

    By default, the Clusters page displays the list of all the Data Warehouse clusters in your Altus account. You can filter the list by environment and status. You can also search for clusters by name.

  3. Click Create Cluster.
  4. In the General Information section, specify the following information:
    Property Description
    Cluster Name The name to identify the cluster that you are creating. The cluster name is an alphanumeric string of any length. It can include dashes (-) and underscores (_). It cannot include a space.
    CDH Version The CDH version that the cluster will use.

    Select the CDH version you want to use for the Data Warehouse cluster.

    CDH 6.1
    You can use a CDH 6.1 cluster only with a configured SDX namespace that points to version 6.1 of the Hive metastore and Sentry databases.
    CDH 5.x
    You can use a CDH 5.x cluster only with a configured SDX namespace that points to version 5.x of the Hive metastore and Sentry databases.
    SDX Namespace The SDX namespace to use for the cluster.

    You must set up the SDX namespace before you create the cluster. You can use the same namespace for multiple clusters.

    Environment Name of the Altus environment that describes the resources to be used for the cluster. The Altus environment specifies the network and instance settings for the cluster.

    If you do not know which Altus environment to select, check with your Altus administrator.

  5. In the Node Configuration section, specify the number of worker nodes to create and the instance type to use for the cluster.
    Property Description
    Worker

    Executes SQL queries sent to the Data Warehouse cluster. Worker nodes do not act as a coordinator for Data Warehouse queries.

    You can configure the following properties for the worker node:
    Instance Type
    Select the instance type from the list of supported instance types.

    Default: r4.2xlarge (61 GB 8 vCPUs)

    Number of Nodes
    Select the number of worker nodes to include in the cluster. A cluster must have a minimum of 3 worker nodes.

    Default: 5

    EBS Storage
    Click the Edit icon to view or modify the EBS storage configuration of the worker node. In the EBS Volume Configuration window, configure the following properties for the EBS Volume:
    • Storage Type. Select the EBS volume type best suited for the database you set up in the cluster.
    • Storage Size. Set the storage size of the EBS volume, expressed in gibibytes (GiB).
    • Volumes per Instance. Set the number of EBS volumes for each instance in the worker node. All EBS volumes are configured with the same volume size and type.
    If you do not configure the EBS volumes, Altus sets the optimum configuration for the EBS volumes based on the CDH version and instance type.

    For more information about Amazon EBS, see Amazon EBS Product Details on the AWS website.

    Coordinator

    Generates query plans and coordinates the execution of queries sent to the Data Warehouse cluster. The coordinator node receives the query results from the worker nodes and constructs the final result set for the query.

    Altus configures the coordinator node for the cluster. You cannot modify the configuration of the coordinator node.

    By default, Altus sets the following configuration for the coordinator node:
    Instance Type
    r4.2xlarge (61 GB 8 vCPUs)
    Number of Nodes
    1
    EBS Storage
    Altus sets the optimum configuration for the EBS volumes based on the CDH version and instance type.

    For more information about Amazon EBS, see Amazon EBS Product Details on the AWS website.

    Master

    Altus configures the master node for the cluster. You cannot modify the master node configuration.
    By default, Altus sets the following configuration for the master node:
    Instance Type
    r4.2xlarge (61 GB 8 vCPUs)
    Number of Nodes
    1
    EBS Storage
    Altus sets the optimum configuration for the master node based on the service type and instance type.
    Cloudera Manager

    Altus configures the Cloudera Manager instance for the cluster. You cannot modify the Cloudera Manager instance configuration.
    By default, Altus sets the following configuration for the Cloudera Manager instance:
    Instance Type
    c4.2xlarge (15 GB 8 vCPUs)
    Number of Nodes
    1
    EBS Storage
    Altus sets the optimum configuration for the Cloudera Manager node based on the service type and instance type.
  6. In the Credentials section, provide the credentials for the user account to log in to Cloudera Manager.
    Property Description
    Public SSH Key You use an SSH key to access instances in the cluster that you are creating. You can provide a public key that Altus will add to the authorized_keys file on each node in the cluster. To connect to the cluster through SSH, use the private key that corresponds to the public key.

    Select File Upload to upload a file that contains the public key or select Direct Input to enter the full key code.

    If you select Skip and you do not provide an SSH public key, you cannot access the cluster through SSH or access the Cloudera Manager instance through a SOCKS proxy.

    For more information about connecting to Altus clusters through SSH, see SSH Connection.

    Cloudera Manager Access Altus creates a read-only user account that you can use to access the Cloudera Manager instance in the cluster. You can allow Altus to generate the user name and password for the user account or you can specify the user name and password for the account.

    To allow Altus to generate the credentials, select Auto-generate. After you click Create Cluster, Altus displays a window with the user name and password for the Cloudera Manager instance. Save the credentials before you close the window.

    To specify the user credentials, click Customize. Specify the user name and password for the user account and then confirm the password. Take note of the user name and password that you specify for the Cloudera Manager user account.

  7. In the Advanced Settings section, set the following optional properties:
    Property Description
    Instance bootstrap script Bootstrap script that is executed on all the cluster instances immediately after start-up before any service is configured and started. You can use the bootstrap script to install additional OS packages or application dependencies.

    You cannot use the bootstrap script to change the cluster configuration.

    Select File Upload to upload a script file or select Direct Input to type the script on the screen.

    The bootstrap script must be a local file. It can be in any executable format, such as a Bash shell script or Python script. The size of the script cannot be larger than 4096 bytes.

    Resource Tags Tags that you define and that you want Altus to append to the cluster that you are creating. Altus appends the tags you define to the nodes and resources associated with the cluster.

    You create the tag as a name-value pair. Click + to add a tag name and set the value for that tag. Click - to delete a tag from the list.

    By default, Altus appends tags to the cluster instance to make it easy to identify nodes in a cluster. When you define tags for the cluster, Altus adds your tags in addition to the default tags.

    For more information about the tags that Altus appends to the cluster, see Altus Tags.

  8. Verify that all required fields are set and click Create Cluster.

    The Data Warehouse service creates a CDH cluster with the configuration you set. On the Clusters page, the new cluster appears at the top of the list of clusters.
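
The limits stated in the steps above (cluster name characters, the minimum of 3 worker nodes, and the 4096-byte bootstrap script limit) can be checked before you fill in the form. The following Python sketch is not part of Altus. The function and constant names are hypothetical, and it assumes that "alphanumeric" means ASCII letters and digits.

```python
import re

# Assumption: "alphanumeric" is read as ASCII letters and digits. Dashes and
# underscores are also allowed; spaces are not.
NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]+")
MIN_WORKER_NODES = 3        # minimum stated for AWS clusters
MAX_BOOTSTRAP_BYTES = 4096  # size limit stated for the bootstrap script


def check_cluster_request(name, worker_nodes, bootstrap_script=b""):
    """Return a list of problems; an empty list means the values look valid.

    bootstrap_script is the script content as bytes (empty if not used).
    """
    problems = []
    if not NAME_PATTERN.fullmatch(name):
        problems.append("cluster name may contain only letters, digits, '-' and '_'")
    if worker_nodes < MIN_WORKER_NODES:
        problems.append(f"at least {MIN_WORKER_NODES} worker nodes are required")
    if len(bootstrap_script) > MAX_BOOTSTRAP_BYTES:
        problems.append(f"bootstrap script exceeds {MAX_BOOTSTRAP_BYTES} bytes")
    return problems
```

For example, check_cluster_request("dw-cluster_01", 5) returns an empty list, while a name that contains a space, 2 worker nodes, and a 5000-byte script produce three problems.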

Creating a Data Warehouse Cluster for Azure

To create an Altus Data Warehouse cluster on the console:
  1. Sign in to the Cloudera Altus console:

    https://console.altus.cloudera.com/

  2. On the side navigation panel, go to the Data Warehouse section and click Clusters.

    By default, the Clusters page displays the list of all the Data Warehouse clusters in your Altus account. You can filter the list by environment and status. You can also search for clusters by name.

  3. Click Create Cluster.
  4. In the General Information section, specify the following information:
    Property Description
    Cluster Name The name to identify the cluster that you are creating. The cluster name is an alphanumeric string of any length. It can include dashes (-) and underscores (_). It cannot include a space.
    CDH Version The CDH version that the cluster will use.

    Select the CDH version you want to use for the Data Warehouse cluster.

    CDH 6.1
    • You can use a CDH 6.1 cluster only with a configured SDX namespace that points to version 6.1 of the Hive metastore and Sentry databases.
    • For clusters with CDH 6.1, Altus archives logs to ADLS Gen1 or Gen2, based on the folder you specify.
    CDH 5.x
    • You can use a CDH 5.x cluster only with a configured SDX namespace that points to version 5.x of the Hive metastore and Sentry databases.
    • For clusters with CDH 5.x, Altus archives logs to ADLS Gen1.
    SDX Namespace The SDX namespace to use for the cluster.

    You must set up the SDX namespace before you create the cluster. You can use the same namespace for multiple clusters.

    Environment Name of the Altus environment that describes the resources to be used for the cluster. The Altus environment specifies the network and instance settings for the cluster.

    If you do not know which Altus environment to select, check with your Altus administrator.

  5. In the Node Configuration section, specify the number of worker nodes to create and the instance type to use for the cluster.
    Property Description
    Worker

    Executes SQL queries sent to the Data Warehouse cluster. Worker nodes do not act as a coordinator for Data Warehouse queries.

    You can configure the following properties for the worker node:
    Instance Type
    Select the instance type from the list of supported instance types.

    Default: STANDARD_DS12_V2 (28 GB 4 vCPUs)

    If you set the instance type for the worker node, Altus sets coordinator and master nodes to the same instance type.

    Number of Nodes
    Select the number of worker nodes to include in the cluster.

    Default: 5

    Disk Configuration
    Click the Edit icon to view or modify the disk configuration of the worker node. In the Disk Configuration window, configure the following properties for the worker node instance:
    • Storage Type. Select the storage type best suited for the database you plan to set up in the cluster.
    • Storage Size. Set the storage size of the disk, expressed in gibibytes (GiB).
    • Disks per Instance. Set the number of disks for each instance in the worker node.
    If you do not set the disk configuration, Altus sets the optimum configuration for the worker node instance based on the CDH version and instance type.
    Coordinator

    Generates query plans and coordinates the execution of queries sent to the Data Warehouse cluster. The coordinator node receives the query results from the worker nodes and constructs the final result set for the query.

    Altus configures the coordinator node for the cluster. You cannot modify the configuration of the coordinator node.

    Altus sets the following default configuration for the coordinator node:
    Instance Type
    STANDARD_DS12_V2 (28 GB 4 vCPUs)

    If you change the instance type for the worker node, Altus sets the coordinator node to the same instance type as the worker node.

    Number of Nodes
    1
    Disk Configuration
    Altus sets the optimum configuration for the coordinator node based on the CDH version and instance type.
    Master

    Altus configures the master node for the cluster. You cannot modify the master node configuration.
    Altus sets the following default configuration for the master node:
    Instance Type
    STANDARD_DS12_V2 (28 GB 4 vCPUs)

    If you change the instance type for the worker node, Altus sets the master node to the same instance type as the worker node.

    Number of Nodes
    1
    Disk Configuration
    Altus sets the optimum configuration for the master node based on the service type and instance type.
    Cloudera Manager

    Altus configures the Cloudera Manager instance for the cluster. You cannot modify the Cloudera Manager instance configuration.
    Altus sets the following configuration for the Cloudera Manager instance:
    Instance Type
    STANDARD_DS12_V2 (28 GB 4 vCPUs)
    Number of Nodes
    1
    Disk Configuration
    Altus sets the optimum configuration for the Cloudera Manager instance based on the service type and instance type.
  6. In the Credentials section, provide the credentials for the user account to log in to Cloudera Manager.
    Property Description
    Public SSH Key You use an SSH key to access instances in the cluster that you are creating. You can provide a public key that Altus will add to the authorized_keys file on each node in the cluster. To connect to the cluster through SSH, use the private key that corresponds to the public key.

    Select File Upload to upload a file that contains the public key or select Direct Input to enter the full key code.

    If you select Skip and you do not provide an SSH public key, you cannot access the cluster through SSH or access the Cloudera Manager instance through a SOCKS proxy.

    For more information about connecting to Altus clusters through SSH, see SSH Connection.

    Cloudera Manager Access Altus creates a read-only user account that you can use to access the Cloudera Manager instance in the cluster. You can allow Altus to generate the user name and password for the user account or you can specify the user name and password for the account.

    To allow Altus to generate the credentials, select Auto-generate. After you click Create Cluster, Altus displays a window with the user name and password for the Cloudera Manager instance. Save the credentials before you close the window.

    To specify the user credentials, click Customize. Specify the user name and password for the user account and then confirm the password. Take note of the user name and password that you specify for the Cloudera Manager user account.

  7. In the Advanced Settings section, set the following optional properties:
    Property Description
    Instance bootstrap script Bootstrap script that is executed on all the cluster instances immediately after start-up before any service is configured and started. You can use the bootstrap script to install additional OS packages or application dependencies.

    You cannot use the bootstrap script to change the cluster configuration.

    Select File Upload to upload a script file or select Direct Input to type the script on the screen.

    The bootstrap script must be a local file. It can be in any executable format, such as a Bash shell script or Python script. The size of the script cannot be larger than 4096 bytes.

    Resource Tags Tags that you define and that you want Altus to append to the cluster that you are creating. Altus appends the tags you define to the nodes and resources associated with the cluster.

    You create the tag as a name-value pair. Click + to add a tag name and set the value for that tag. Click - to delete a tag from the list.

    By default, Altus appends tags to the cluster instance to make it easy to identify nodes in a cluster. When you define tags for the cluster, Altus adds your tags in addition to the default tags.

    For more information about the tags that Altus appends to the cluster, see Altus Tags.

  8. Verify that all required fields are set and click Create Cluster.

    The Data Warehouse service creates a CDH cluster with the configuration you set. On the Clusters page, the new cluster appears at the top of the list of clusters.
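
The Azure rules described above can be summarized in a short Python sketch: the CDH version must match the version that the SDX namespace points to, the log archive target depends on the CDH version, and the coordinator and master nodes follow the worker instance type. This only illustrates those rules and is not Altus code. All names are hypothetical, and keeping the Cloudera Manager instance on the default type is an assumption based on the absence of any note to the contrary.

```python
DEFAULT_INSTANCE_TYPE = "STANDARD_DS12_V2"  # default stated for Azure nodes

# Log archive targets per CDH line, as listed in the General Information step.
LOG_ARCHIVE_TARGETS = {
    "6.1": ("ADLS Gen1", "ADLS Gen2"),
    "5.x": ("ADLS Gen1",),
}


def cdh_line(version):
    """Map a version string such as '5.16' or '6.1' to the '5.x' or '6.1' line."""
    if version == "6.1" or version.startswith("6.1."):
        return "6.1"
    if version.startswith("5."):
        return "5.x"
    raise ValueError(f"unsupported CDH version: {version}")


def namespace_is_compatible(cluster_cdh, namespace_cdh):
    """A cluster can use a namespace only if both are on the same CDH line."""
    return cdh_line(cluster_cdh) == cdh_line(namespace_cdh)


def resolve_instance_types(worker_type=None):
    """Coordinator and master nodes follow the worker instance type."""
    chosen = worker_type or DEFAULT_INSTANCE_TYPE
    return {
        "worker": chosen,
        "coordinator": chosen,
        "master": chosen,
        # Assumption: the Cloudera Manager instance keeps the default type.
        "cloudera_manager": DEFAULT_INSTANCE_TYPE,
    }
```

For example, namespace_is_compatible("6.1", "5.16") is False, and resolve_instance_types("STANDARD_DS13_V2") sets the worker, coordinator, and master nodes to that type. The instance type in this call is only an illustrative value.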

Getting the Cluster Credentials on the Console

You connect to an Altus Data Warehouse cluster through the coordinator node. On the Altus console, you can display the IP address of the coordinator node in a cluster.

When you create a secure Data Warehouse cluster, Altus generates a user ID and a password that are unique to the cluster. Access to a secure cluster from a client tool requires the user ID and password. When you set up a connection to a Data Warehouse cluster from your client tool using ODBC or an Impala JDBC Connector version older than 2.6.9, the connection properties must include the IP address of the coordinator node and the user name and password for the cluster.

To get the user ID and password of a cluster on the console:
  1. Sign in to the Cloudera Altus console:

    https://console.altus.cloudera.com/

  2. On the side navigation panel, go to the Data Warehouse section and click Clusters.

    By default, the Clusters page displays the list of all the Data Warehouse clusters in your Altus account. You can filter the list by environment and status. You can also search for clusters by name.

  3. Click the name of the cluster that you want to access.

    On the Cluster details page, review the cluster information to verify that it is the cluster that you want to access.

  4. Click Actions and select Display Access Information.

    If the cluster is secure, the Cluster Access Information window displays the user name and password for the cluster and the Coordinator Endpoint field with the IP address of the coordinator node. You need the user name and password to access the cluster.

    If the cluster is not secure, the Cluster Access Information window displays the Coordinator Endpoint field with the IP address of the coordinator node. You can access the cluster without a user name or password.

  5. Copy the cluster credentials and coordinator node IP address and use them when you set up access to the cluster from a client tool.
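
As a rough illustration of how the values from the Cluster Access Information window fit together, the following Python sketch assembles connection properties for a client tool. The function name and the keys host, user, and password are hypothetical placeholders, not the property names of any specific ODBC or JDBC driver. Check your driver documentation for the actual names.

```python
def build_connection_properties(coordinator_endpoint, secure, user=None, password=None):
    """Collect the values a client tool needs to reach the coordinator node.

    The dictionary keys are illustrative placeholders, not driver properties.
    """
    if not coordinator_endpoint:
        raise ValueError("the coordinator node IP address is required")
    props = {"host": coordinator_endpoint}
    if secure:
        # A secure cluster cannot be accessed without its user name and password.
        if not user or not password:
            raise ValueError("a secure cluster requires the user name and password")
        props["user"] = user
        props["password"] = password
    return props
```

For a cluster that is not secure, only the coordinator endpoint is needed, so the result contains the host entry alone.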

Viewing the Cluster Status

To view information about Altus clusters on the console:
  1. Sign in to the Cloudera Altus console:

    https://console.altus.cloudera.com/

  2. On the side navigation panel, go to the Data Warehouse section and click Clusters.

    By default, the Clusters page displays the list of all the Data Warehouse clusters in your Altus account. You can filter the list by environment and status. You can also search for clusters by name.

    The Clusters list shows the following information:
    • Cluster name
    • Status

      For more information about the different statuses that a cluster can have, see Cluster Status.

    • Version of CDH that runs in the cluster.
    • Number of worker nodes
    • Date and time the cluster was created in Altus
    • Instance type for the cluster
  3. You can click the Actions button for a cluster to perform the following tasks:
    • Clone Cluster. To create a cluster of the same type and characteristics as the cluster that you are viewing, select the Clone Cluster action. On the Create Cluster page, you can create a cluster with the same properties as the cluster you are cloning. You can modify or add to the properties before you create the cluster.
    • Delete Cluster. To terminate a cluster, select the Delete Cluster action for the cluster you want to terminate.
  4. To view the details of a cluster, click the name of the cluster you want to view.

    The Cluster Details page displays information about the cluster in more detail, including the configuration of the cluster nodes and Cloudera Manager instance.
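
The filtering and search behavior of the Clusters page can be pictured with a small Python sketch. It is a model of the idea only. The ClusterSummary fields mirror a subset of the columns listed above plus the environment, the names are hypothetical, and treating the name search as a case-insensitive substring match is an assumption.

```python
from dataclasses import dataclass


@dataclass
class ClusterSummary:
    # Hypothetical record holding a subset of the Clusters list columns.
    name: str
    status: str
    environment: str
    cdh_version: str
    worker_nodes: int


def filter_clusters(clusters, environment=None, status=None, name_contains=None):
    """Return the clusters that match every filter that is set."""
    result = []
    for cluster in clusters:
        if environment is not None and cluster.environment != environment:
            continue
        if status is not None and cluster.status != status:
            continue
        # Assumption: name search is a case-insensitive substring match.
        if name_contains is not None and name_contains.lower() not in cluster.name.lower():
            continue
        result.append(cluster)
    return result
```

Filters that are left unset do not restrict the list, which matches the default view that shows all Data Warehouse clusters in the account.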

Viewing the Cluster Details

To view the details of a Data Warehouse cluster on the console:
  1. Sign in to the Cloudera Altus console:

    https://console.altus.cloudera.com/

  2. On the side navigation panel, go to the Data Warehouse section and click Clusters.

    By default, the Clusters page displays the list of all the Data Warehouse clusters in your Altus account. You can filter the list by environment and status. You can also search for clusters by name.

  3. Click the name of a cluster.
    The details page for the selected cluster displays the status of the cluster and the following information:
    Cluster Status
    The details page displays information appropriate for the status of the cluster. For example, if a cluster failed, the details page displays the failure message but does not display a link to the Cloudera Manager instance.
    Write Queries
    The Write Queries link takes you to the Query Editor page. On the Query Editor page, you can create queries using the Impala SQL query engine to analyze data in your Altus Data Warehouse cluster.
    Cloudera Manager Configuration
    The Cloudera Manager Configuration section displays the instance type and connection details for the Cloudera Manager instance.

    The cluster details page displays the private IP address assigned to the Cloudera Manager instance in the cluster. If the Public IPs option for the environment used to create the cluster is enabled, the page also displays the public IP addresses. You can log in to Cloudera Manager through the public or private IP. If the public IP addresses are available, you can click a link to view the Altus command to set up a SOCKS proxy server to access the Cloudera Manager instance in the cluster.

    The Cloudera Manager Configuration section appears only if the Cloudera Manager instance is accessible. The Cloudera Manager instance might not be accessible when the cluster status is Creating or when the cluster failed at creation time.

    Node Configuration
    The Node Configuration section displays the configuration of the master node, coordinator node, and worker nodes in the cluster, including the instance types and number of nodes. For clusters in AWS, the section shows the EBS volume configuration. For clusters in Azure, the section shows the storage configuration and number of disks per instance.
    Cluster Details
    • Log Archive Location shows where the cluster and job logs are archived.
    • Uses instance bootstrap script? shows whether a bootstrap script runs before cluster startup.
    • Security shows whether the cluster is secure or not, based on the setting for the Secure Clusters option in the environment.
    • Resource Tags shows the resource tags set up for the cluster.
    Cluster summary and other information
    • Creation Time shows the time when a user created the cluster in Altus.
    • If the cluster is a secure cluster, User Sync Status shows the last time that the list of Altus users and groups who have access to the cluster was synchronized with the Data Warehouse cluster. If the list has not yet been synchronized with the cluster, the section displays the status Never run. If the list has been synchronized, the section displays whether the last synchronization completed or failed and the time it took from the start of the synchronization to completion or failure. You can click Sync Users to manually synchronize the users and groups.

      If the cluster is not a secure cluster, the User Sync Status section does not appear.

    • Total Nodes shows the number of nodes in the cluster. The total number includes the following nodes:
      • Master node
      • Worker nodes
      • Coordinator node

      To view information about the nodes, click View. The Instances window displays the list of instances in the cluster, their instance IDs and IP addresses, and their roles in the cluster. The list of instances does not include the Cloudera Manager instance.

    • Environment displays the Altus environment used by the cluster.
    • Region indicates the region where the cluster is created.
    • CDH Version shows the version of CDH that is used for the cluster.
    • CRN shows the Cloudera Resource Name (CRN) assigned to the cluster. Because the CRN is a long string of characters, Altus provides a copy icon so you can easily copy the CRN for any purpose.
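
The Total Nodes value described above is the sum of the master, coordinator, and worker nodes and does not count the Cloudera Manager instance. A minimal Python sketch of that arithmetic follows. The function name is hypothetical, and the defaults of one master node and one coordinator node come from the node configuration described for cluster creation.

```python
def total_nodes(worker_nodes, coordinator_nodes=1, master_nodes=1):
    """Count the nodes reported as Total Nodes.

    The Cloudera Manager instance is deliberately not counted.
    """
    return master_nodes + coordinator_nodes + worker_nodes
```

With the default of 5 worker nodes, total_nodes(5) is 7.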

Terminating a Cluster

To terminate a cluster on the console:
  1. Sign in to the Cloudera Altus console:

    https://console.altus.cloudera.com/

  2. On the side navigation panel, go to the Data Warehouse section and click Clusters.

    By default, the Clusters page displays the list of all the Data Warehouse clusters in your Altus account. You can filter the list by environment and status. You can also search for clusters by name.

  3. Click the name of the cluster to terminate.

    On the Cluster details page, review the cluster information to verify that it is the cluster that you want to terminate.

  4. Click Actions and select Delete Cluster.
  5. Click OK to confirm that you want to terminate the cluster.