Overview of Cloudera Altus

Cloudera Altus is a cloud service platform with services that enable you to use CDH to analyze and process data at scale within a public cloud infrastructure, including Amazon Web Services (AWS) and Microsoft Azure. Altus can provision clusters quickly and make it easy for you to build and run your data workloads in the cloud.

Altus works within the cloud service provider architecture. You can choose the cloud service provider on which Altus creates your clusters and runs your jobs. On AWS, Altus creates clusters in a VPC in your AWS account and Altus jobs read input from and write output to Amazon S3. On Azure, Altus creates clusters in a virtual network (VNet) in your Azure subscription and Altus jobs read input from and write output to Azure Data Lake Store (ADLS).
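
The provider-specific placement described above can be summarized as a small lookup table. The following Python sketch is illustrative only: `PLACEMENT` and `describe_placement` are hypothetical names chosen for this example and are not part of the Altus client or SDK.

```python
# Illustrative only: these names are hypothetical, not part of any Altus API.
PLACEMENT = {
    "aws": {
        "network": "VPC",
        "owner": "AWS account",
        "storage": "Amazon S3",
    },
    "azure": {
        "network": "virtual network (VNet)",
        "owner": "Azure subscription",
        "storage": "Azure Data Lake Store (ADLS)",
    },
}


def describe_placement(provider: str) -> str:
    """Summarize where Altus creates clusters and where jobs read and write data."""
    key = provider.strip().lower()
    if key not in PLACEMENT:
        raise ValueError(f"unsupported cloud provider: {provider!r}")
    p = PLACEMENT[key]
    return (
        f"clusters run in a {p['network']} in your {p['owner']}; "
        f"jobs read from and write to {p['storage']}"
    )
```

For example, `describe_placement("azure")` returns a summary that names the VNet, the Azure subscription, and ADLS.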

Altus offers a web user interface (the Altus console), a command line interface (CLI), and the Altus SDK for Java. You can use the Altus console, CLI, or SDK to create and manage environments and clusters, run jobs, and perform other tasks in Altus. The Altus console provides tools to facilitate administrative tasks, such as setting up an environment and generating access keys. The Altus SDK for Java enables you to connect to the Altus Data Engineering service and perform Altus tasks from your application.

Cloudera Altus provides the following services for your workloads:
Altus Data Engineering service
The Altus Data Engineering service enables you to create clusters and run jobs specifically for data science and engineering workloads. Altus offers multiple distributed processing engine options, including Hive, Spark, Hive on Spark, and MapReduce2 (MR2), which allow you to manage workloads in ETL, machine learning, and large-scale data processing.
Altus Data Warehouse service
The Altus Data Warehouse service enables you to create clusters running the Impala SQL engine to access data in your cloud storage for business analysis and reporting. You can use the query editor in the Altus console to query the data or use standard business intelligence tools with ODBC or JDBC to connect to Data Warehouse clusters to query the data.
Altus Shared Data Experience (SDX) service
The Altus Shared Data Experience (SDX) service provides a consistent view of data for CDH clusters and workloads running on the cloud. The Altus SDX namespace externalizes cluster metadata into a shared, long-running service available to multiple clusters and workloads running on the cloud. You can use the Altus SDX namespace with workloads that run on CDH clusters on the cloud accessing data in Amazon S3 or Azure Data Lake Store (ADLS).

Altus Features

Cloudera Altus provides the following features:
Environments

An Altus environment identifies the resources in your AWS account or Azure subscription to be used for Altus clusters and jobs. The environment allows you to create clusters in multiple AWS accounts or Azure subscriptions from a single Altus account.

You can set up and assign separate Altus environments to different users so they can access only the resources that you allow them to use.

User authorization and access management

You can assign roles to users to manage their Altus privileges and access to resources. Altus provides pre-defined roles that you can assign to users and designated administrators.

Clusters for Altus Data Engineering workloads

The Altus Data Engineering service provisions single-user, transient clusters in your AWS account or Azure subscription. You can easily configure and create a cluster with the compute engine that you require to process your jobs: Hive, Spark, Hive on Spark, or MapReduce2 (MR2). You can also create a cluster that supports multiple compute engines to run your jobs: Hive, Spark, and MapReduce2.

Each cluster has a job queue to manage the jobs that run on the cluster and supports a workflow with a single pipeline.

Clusters for Altus Data Warehouse workloads

The Altus Data Warehouse service provisions clusters in your AWS account or Azure subscription that can be accessed by multiple Altus users. You can easily configure and create a cluster that runs the Impala SQL engine so that you can iteratively access data in your cloud object storage for analysis and reporting.

Altus Data Engineering Jobs

You can submit jobs to run on a cluster that contains the service you need. On AWS, Altus jobs read input from and write output to Amazon S3. On Azure, Altus jobs read input from and write output to Azure Data Lake Store.

The Altus Data Engineering workflow centers on the job. You can submit a job, create a cluster on which to run the job, and terminate the cluster when the job completes, all in a single process. You can access Cloudera Manager, where you can view the job history servers for the cluster. You can also generate reports in Workload Analytics to monitor and optimize job performance and troubleshoot issues.
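
The job-centered workflow (submit a job, create a cluster to run it, terminate the cluster when the job completes) can be sketched as a small simulation. This is not the Altus client or SDK: `TransientCluster` and `run_jobs_on_new_cluster` are hypothetical names, and the first-in, first-out queue order is an assumption made for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TransientCluster:
    """Hypothetical stand-in for a single-user, transient cluster."""

    name: str
    state: str = "CREATED"
    queue: List[str] = field(default_factory=list)
    completed: List[str] = field(default_factory=list)

    def submit(self, job: str) -> None:
        # Each cluster has one job queue; new jobs go to the back.
        self.queue.append(job)

    def run_all(self) -> None:
        # Assumption: jobs leave the queue in submission order.
        while self.queue:
            self.completed.append(self.queue.pop(0))

    def terminate(self) -> None:
        self.state = "TERMINATED"


def run_jobs_on_new_cluster(cluster_name: str, jobs: List[str]) -> TransientCluster:
    """Create a cluster, run the jobs, and always terminate the cluster."""
    cluster = TransientCluster(cluster_name)
    try:
        for job in jobs:
            cluster.submit(job)
        cluster.run_all()
    finally:
        # Terminate even if a job fails, so no transient cluster is left running.
        cluster.terminate()
    return cluster
```

Wrapping the run in `try`/`finally` mirrors the idea that the cluster exists only for the duration of the job.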

Altus SDX Namespaces

The Altus SDX namespace points to a database that stores metadata for data accessed by CDH clusters on the cloud, providing a common and consistent view of the data to the clusters. When an Altus SDX namespace is shared across multiple Altus clusters that access the same data, the clusters can immediately access the metadata without the need for each cluster to recreate the metadata.

You can use an Altus SDX namespace for clusters that access data in Amazon S3 or in Azure Data Lake Store (ADLS).
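
The benefit of sharing a namespace can be illustrated with a toy model in which two clusters hold a reference to the same metadata store. The classes `Namespace` and `Cluster`, the table name, and the `s3a://example-bucket/...` location are all hypothetical and chosen only for this sketch; they do not represent the Altus SDX interface.

```python
class Namespace:
    """Hypothetical model of a shared, long-running metadata store."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._tables = {}

    def register_table(self, table: str, location: str) -> None:
        self._tables[table] = location

    def lookup(self, table: str) -> str:
        return self._tables[table]


class Cluster:
    """Hypothetical cluster that reads metadata from a shared namespace."""

    def __init__(self, name: str, namespace: Namespace) -> None:
        self.name = name
        self.namespace = namespace  # a reference to the shared store, not a copy

    def table_location(self, table: str) -> str:
        return self.namespace.lookup(table)


shared = Namespace("analytics")
etl_cluster = Cluster("etl", shared)

# Metadata is recorded once in the shared namespace.
shared.register_table("sales", "s3a://example-bucket/sales/")

# A cluster created later sees the same metadata without recreating it.
bi_cluster = Cluster("bi", shared)
location = bi_cluster.table_location("sales")
```

Because both clusters point at the same namespace object, the second cluster resolves `sales` without registering it again.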

Altus Interfaces

Cloudera Altus provides the following user interfaces:

  • Altus console
  • Command line interface (CLI)
  • Altus SDK for Java

Altus Console

The Altus console is the web user interface that provides a visual way to administer Altus users and environments and to create clusters and run jobs.

If you are an Altus administrator, you can perform all tasks on the Altus console. If you are not an administrator, the role and environment assigned to you determine the areas of the console that you can access.

To access the Altus console, go to the following URL: https://console.altus.cloudera.com/

Side Navigation Panel

The Altus console has a side navigation panel with links to the following pages on the console:
Home
Home is the main page for the Altus console. When you log in to the Altus console, the Home page is the first page you view.

The Home page provides links to the services that you have access to and tasks that you can perform on the Altus console.

The What's New section lists the latest features added to Altus. You can also download the latest version of the Altus client.

Environments
Requires Altus administrator privileges.

On the Environments page, you can create an Altus environment using the environment quickstart or wizard. You can clone or delete an environment or assign an environment to a user.

IAM
Requires IAMUser privileges.

The Users page displays the list of Altus users. You can view information about a user, update the roles assigned to the user, or generate an access key for the user.

SDX Namespaces
Requires SDXAdmin privileges.

On the SDX Namespaces page, you can create or delete SDX namespaces.

Altus Data Engineering Clusters
The Altus Data Engineering Clusters page displays the list of all the Altus Data Engineering clusters in your Altus account. You can filter the list of clusters by environment, type, and status. You can create, clone, or delete a cluster.
Altus Data Engineering Jobs
The Jobs page displays the list of all the Altus Data Engineering jobs in your Altus account. You can submit a job to run on an existing cluster or a new cluster.
Altus Data Warehouse Clusters
The Altus Data Warehouse Clusters page displays the list of all the Data Warehouse clusters in your Altus account. You can create, clone, or delete a cluster. You can also use a query editor to create and run SQL queries against data in the Data Warehouse clusters.
Beta Services
When you select Beta Services, you can get information about and request access to Altus services that are in Beta release.
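
The privilege requirements listed for the navigation pages can be expressed as a lookup. The sketch below is a simplification under stated assumptions: `PAGE_REQUIREMENTS` and `accessible_pages` are hypothetical names, and pages for which no requirement is listed above are treated as open, whereas in practice the role and environment assigned to a user determine what that user can access.

```python
ADMIN = "Altus administrator"

# Page name -> privilege listed above, or None where no requirement is stated.
PAGE_REQUIREMENTS = {
    "Home": None,
    "Environments": ADMIN,
    "IAM": "IAMUser",
    "SDX Namespaces": "SDXAdmin",
    "Altus Data Engineering Clusters": None,
    "Altus Data Engineering Jobs": None,
    "Altus Data Warehouse Clusters": None,
    "Beta Services": None,
}


def accessible_pages(privileges):
    """Return the pages a user with the given set of privileges can open."""
    held = set(privileges)
    if ADMIN in held:
        # An Altus administrator can perform all tasks on the console.
        return list(PAGE_REQUIREMENTS)
    return [
        page
        for page, required in PAGE_REQUIREMENTS.items()
        if required is None or required in held
    ]
```

For instance, `accessible_pages({"IAMUser"})` includes the IAM page but not the Environments or SDX Namespaces pages.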

My Account Page

You can view information about your Altus account on the My Account page. To get to the My Account page, click your user name and select My Account.

If you are an Altus administrator, you can change the roles and resources assigned to you. You can create access keys and delete or deactivate the access keys that you have created. Deactivate a key if you do not want the key to be used to access Altus. You can reactivate a key at any time.

If you are not an administrator, you can view your user account information and your access keys.
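
The access key lifecycle described above (create, deactivate, reactivate, delete) can be modeled as a small state machine. `AccessKey` and its methods are hypothetical names for this sketch, not the Altus API, and the rule that a deleted key cannot be reactivated is an assumption.

```python
class AccessKey:
    """Hypothetical model of an access key's lifecycle."""

    def __init__(self, key_id: str) -> None:
        self.key_id = key_id
        self.active = True    # a newly created key is usable
        self.deleted = False

    def _require_not_deleted(self) -> None:
        if self.deleted:
            raise RuntimeError(f"access key {self.key_id} has been deleted")

    def deactivate(self) -> None:
        # A deactivated key cannot be used to access Altus.
        self._require_not_deleted()
        self.active = False

    def reactivate(self) -> None:
        # A deactivated key can be reactivated at any time.
        self._require_not_deleted()
        self.active = True

    def delete(self) -> None:
        # Assumption: deletion is permanent.
        self.deleted = True
        self.active = False

    def can_authenticate(self) -> bool:
        return self.active and not self.deleted
```

Deactivation is the reversible option: use it when a key should stop working temporarily, and delete the key only when it is no longer needed.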

Altus Client

Altus provides a command-line interface (CLI) through a Python client. If you are an Altus administrator, you can use the CLI to perform all tasks in Altus. If you are not an administrator, the role and environment assigned to you determine the commands that you can run.

For more information about setting up the CLI, see Cloudera Altus Client Setup.

Altus SDK for Java

You can use the Cloudera Altus SDK for Java to access Altus services programmatically from your Java application. With the SDK, you can create and manage environments and clusters and run jobs.

For more information about using the Cloudera Altus SDK for Java, see Using the Altus SDK for Java.

Supported Browsers

The Cloudera Altus console is tested and validated against the latest versions of the following browsers and supports recent versions of them:

  • Google Chrome
  • Mozilla Firefox
  • Internet Explorer 11
  • Safari 9 or higher
  • Microsoft Edge