Cloudera Tutorials


Introduction

 

This tutorial walks you through installing the trial version of Cloudera Base on premises. We will use AWS as the infrastructure-as-a-service (IaaS) provider to create the underlying infrastructure.

At the time of writing, Cloudera Base on Private Cloud is built using Cloudera Manager 7.4.4 and Cloudera Runtime 7.1.7.

 

 

Prerequisites

 

  • Administrative access to an AWS account
  • AWS CLI, installed and configured
  • jq, a command-line tool for parsing JSON, installed. Refer to stedolan.github.io for help with installation.
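
Before continuing, it can help to confirm that the tools listed above are available. The following is a minimal sketch and is not part of the tutorial assets: check_tools is a hypothetical helper name, and it only checks that each command is on your PATH, not that it is configured.

```shell
# Report whether each required command-line tool is on the PATH.
# Returns 0 only if every tool passed as an argument is found.
check_tools() {
  local missing=0 tool
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "found: $tool"
    else
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

# Usage (run manually):
#   check_tools aws jq
#   aws sts get-caller-identity   # one way to confirm the AWS CLI has working credentials
```

If either tool is reported as missing, install it before running the scripts in this tutorial.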

 

 

Watch Video

 

The video below provides a brief overview of what is covered in this tutorial:

 

 

Download Assets

 

There are two options for getting the assets for this tutorial:

  1. Download a ZIP file

It contains only the files used in this tutorial. Unzip tutorial-files.zip and remember its location.

  2. Clone our GitHub repository

It provides the assets used in this and other tutorials, organized by tutorial title.

 

 

Create Infrastructure

 

The Download Assets section provides a bash script, create-iaas.sh. It creates the infrastructure and requirements needed to install Cloudera Base on Private Cloud (trial version), using AWS as the cloud service provider.

There is no need to edit or modify the script.

IMPORTANT: Make note of the SUMMARY and ACTION ITEMS sections of the script output. They describe the infrastructure created and the action items you must take.

 

On your local computer’s command line, issue the following command:

./create-iaas.sh

 

output-create-summary
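
Because the SUMMARY and ACTION ITEMS sections are needed in later steps, you may want to keep a copy of the script output. The sketch below assumes a hypothetical helper, run_and_log, and a log file name of your choosing; neither is part of the tutorial assets.

```shell
# Run a command, show its output on screen, and save a copy to a log file.
# Usage: run_and_log <log-file> <command> [args...]
run_and_log() {
  local log_file="$1"
  shift
  # 2>&1 sends error messages to the same log as normal output.
  "$@" 2>&1 | tee "$log_file"
}

# Usage (run manually, in place of calling the script directly):
#   run_and_log create-iaas-output.log ./create-iaas.sh
```

The log file can then be reopened whenever a later step refers back to the script output.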

 

Map IP Addresses

 

Based on the infrastructure the script created, your hosts file needs to be updated to map the IP addresses to host names.

From the script output, select and copy the text underneath 1. APPEND TO HOST FILE: and append it to your hosts file.

NOTE: You must have administrator/sudo privileges to modify the hosts file.

 

The hosts file location depends on your operating system:

macOS: /private/etc/hosts

Linux: /etc/hosts

Windows: C:\Windows\System32\drivers\etc\hosts

 

output-create-host-file
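
On macOS or Linux, the append step can be sketched as a small function. This is only an illustration under stated assumptions: append_hosts is a hypothetical helper, the entries file is a text file into which you have pasted the lines copied from the script output, and any IP address shown in the comments is a placeholder rather than a value from your environment.

```shell
# Append each line of an entries file to a hosts file,
# skipping blank lines and lines that are already present.
# Usage: append_hosts <entries-file> <hosts-file>
append_hosts() {
  local entries="$1" hosts="$2" line
  while IFS= read -r line || [ -n "$line" ]; do
    if [ -z "$line" ]; then
      continue
    fi
    # -x matches whole lines only, -F treats the line as fixed text.
    if ! grep -qxF -- "$line" "$hosts" 2>/dev/null; then
      printf '%s\n' "$line" >> "$hosts"
    fi
  done < "$entries"
}

# Usage (run manually; try it on a copy of the hosts file first):
#   printf '203.0.113.10 cdptrial\n' > entries.txt   # placeholder IP address
#   append_hosts entries.txt hosts-copy.txt
```

Writing to the real hosts file requires administrator rights, so one option is to save the function in a script and run that script with sudo. Skipping lines that already exist means the step can be repeated without creating duplicate entries.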

 

Run Commands on each Node

 

As explained in the script output, under ACTION ITEMS, run the specified commands on each node.

NOTE: Update the primary node last and do not exit its terminal, because more commands will be run there in the next step.

 

Reason for updates:

 

output-create-run-commands
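
To keep track of the order in which to connect, the sketch below prints one ssh command per node with the primary node last. It is an illustration only: print_ssh_order is a hypothetical helper, the host names in the usage comment are placeholders, and it assumes the nodes accept the centos user and the cdp-trial-key.pem key that are used later in this tutorial. The commands to run on each node are the ones listed under ACTION ITEMS in your script output.

```shell
# Print one ssh command per node, with the primary node listed last.
# Usage: print_ssh_order <key-file> <primary-host> <host> [host...]
print_ssh_order() {
  local key="$1" primary="$2" host
  shift 2
  for host in "$@"; do
    # Skip the primary here so that it is only printed once, at the end.
    if [ "$host" != "$primary" ]; then
      echo "ssh -i $key centos@$host"
    fi
  done
  echo "ssh -i $key centos@$primary"
}

# Usage (run manually; node1 and node2 are placeholder host names):
#   print_ssh_order ~/.ssh/cdp-trial-key.pem cdptrial node1 node2
```

Printing the commands instead of running them leaves you in control of each session, including keeping the final session on the primary node open.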

 

Download Trial Version

 

Let’s download the trial version of Cloudera Base on premises from the Cloudera Downloads site:

  1. Select Cloudera on Private Cloud, Free Trial.
  2. Select Cloudera Base on Private Cloud, Try now.
  3. Select Trial Version Cloudera on Private Cloud, TRY NOW.
  4. Follow the Trial Installation instructions provided on the page.

    NOTE: In the previous step, we deliberately left the SSH connection to the primary node open. This is where we will run the Cloudera Manager installer, cloudera-manager-installer.bin.

 

download-trial-version

 

Install Cloudera Manager Server

 

Let’s run the Cloudera Manager installer, cloudera-manager-installer.bin, on the primary node as explained earlier.

After reading the Cloudera Manager README, click Next.

 

cm-installer-readme

 

After reading the Cloudera Standard License agreement, click Next.

If you agree to the license agreement, click Yes to accept the license.

 

cm-installer-license

 

It will take a few minutes to install Cloudera Manager Server.

When it completes, you will be asked to point your browser to the URL provided and log in using the credentials provided.

NOTE: The host name in the URL resolves because you modified the hosts file in the Map IP Addresses section.

cm-installer-open-bowser
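
If the page does not load right away, you can check from your local command line whether the server is responding. This is a minimal sketch under stated assumptions: cm_url and http_status are hypothetical helper names, the host name cdptrial and port 7180 are taken from the URL used in the next section, and curl must be installed.

```shell
# Build the Cloudera Manager URL for a host (the port defaults to 7180).
# Usage: cm_url <host> [port]
cm_url() {
  printf 'http://%s:%s\n' "$1" "${2:-7180}"
}

# Print the HTTP status code returned by a URL.
# curl prints 000 and exits non-zero if it cannot connect at all.
# Usage: http_status <url>
http_status() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$1"
}

# Usage (run manually):
#   http_status "$(cm_url cdptrial)"
```

A status of 000 usually means the server is not reachable yet or the host name does not resolve, in which case rechecking the hosts file entries is a reasonable first step.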

 

Select OK to close the window.

Select OK to exit the Cloudera Manager installer.

Type exit on the primary node’s command line to close the SSH connection.

 

Install Cluster

 

Now that you have opened your web browser to http://cdptrial:7180 and logged in to Cloudera Manager, let’s configure Cloudera Manager.

When asked for License File, select Try Cloudera Data Platform for 60 days, accept the Terms and Conditions and select Continue.

 

cm-install-cluster-license

 

After reading the welcome message, select Continue.

NOTE: The warnings received about AutoTLS and KDC can be ignored since we’re installing the trial version.

 

cm-install-cluster-welcome

 

Cluster Basics

 

Choose a name for your cluster - we will use the default, Cluster 1.

Select Continue

 

cm-install-cluster-basics

 

Specify Hosts

 

Enter the cluster host names or IP addresses in the Hostname field.

The IP addresses are listed under CDP Hosts in the create-iaas.sh script output.

 

output-create-hosts

 

Make sure the SSH Port is 22, select Search, then Continue.

IMPORTANT: Make sure all hosts are successfully scanned, and select all IP addresses.

 

cm-install-cluster-hosts

 

Select Repository

 

  1. Select Cloudera Repository for the repository location.
  2. Select Use Parcels (Recommended)
  3. Click on Parcel Repositories & Network Settings to modify settings
    1. Remove the following URLs by clicking the icon next to each parcel URL:
      https://archive.cloudera.com/p/cdh7/{latest_supported}/parcels/
      https://archive.cloudera.com/p/cdh6/{latest_supported}/parcels/
      https://archive.cloudera.com/p/cdh5/parcels/latest
      https://archive.cloudera.com/accumulo-c5/parcels/latest/
      https://archive.cloudera.com/accumulo6/6.1/parcels/
      https://archive.cloudera.com/kafka/parcels/latest/
      
    2. Select Save & Verify Configuration
      Confirm that all URLs have (HTTP Status: 200)
    3. Select Close
  4. Select Continue

 

cm-install-cluster-select-repository

 

Select JDK

 

  1. Select Install a Cloudera-provided version of OpenJDK
  2. Click Continue

 

cm-install-cluster-jdk

 

Enter Login Credentials

 

  1. Select Another user and enter centos.
  2. Select All hosts accept same private key.
  3. Select Choose File and provide the location of cdp-trial-key.pem. For example, ~/.ssh/cdp-trial-key.pem.
  4. Select Continue

 

cm-install-cluster-login-credentials

 

Install Agents

 

The Cloudera Manager agent will be installed on all nodes - this will take a few minutes.

 

cm-install-cluster-agents

 

Install Parcels

 

Parcels will be installed on all nodes - it will take some time.

 

cm-install-cluster-parsels

 

Inspect Cluster

 

Let’s inspect the cluster and resolve any issues before proceeding.

  1. Select Inspect Network Performance
  2. Select Inspect Hosts

    NOTE: You may receive the warning Transparent Huge Page Compaction is enabled, which may be ignored. However, all other issues should be resolved before continuing.

  3. Select I understand the risks of not running the inspections or the detected issues, let me continue with cluster setup.
  4. Select Continue

 

cm-install-cluster-inspection

 

Configure Cluster

 

Select Services

 

You are given a few options of which services to install.

Select Data Engineering and click Continue.

 

cm-configue-cluster-services

 

Assign Roles

 

Cloudera Manager pre-assigns these roles. Click Continue.

NOTE: If the Kafka broker role is not assigned, assign it to the host with the fewest roles.

 

cm-configue-cluster-roles

 

Setup Database

 

This step configures and tests the database connections.

Make sure Use Embedded Database is selected.

NOTE: You may receive a warning stating that PostgreSQL database is not supported for use in production environments. This warning can be ignored because we are using the trial version.

Scroll down to the bottom of the page and select Test Connection. All tested connections should be successful.

Click Continue.

 

cm-configue-cluster-database

 

Enter Required Parameters

 

Choose a secure password for each required parameter and make a note of it.

Select Continue

 

cm-configue-cluster-parameters

 

Review Changes

 

Review the configuration attributes - they should all be correct.

Select Continue

 

cm-configue-cluster-review

 

Command Details

 

Additional services are installed and configured - this will take some time.

NOTE: If you encounter an error, retry by selecting Resume. It may be a timing issue while running steps in parallel.

Select Continue

 

cm-configue-cluster-details

 

Cluster Summary

 

Your Cloudera Private Cloud Base (trial version) cluster has been installed and configured, and is ready to use.

Click Finish

 

cm-configue-cluster-summary

 

The Cloudera Manager Home screen should look like this:

 

cm-home

 

Summary

 

Congratulations on completing the tutorial.

You have successfully installed Cloudera Private Cloud Base (trial version).

 

 

Appendix A: Uninstall Cluster

 

When you are ready to terminate the infrastructure used for Cloudera Private Cloud Base (trial version), issue the following command on your local computer’s command line. The script was provided in the Download Assets section.

./cleanup-iaas.sh

 

output-cleanup
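
After the cleanup finishes, you may want to confirm that no instances are still running. The sketch below is an assumption-laden illustration, not part of the tutorial assets: live_instance_ids is a hypothetical helper that uses jq to filter the JSON printed by aws ec2 describe-instances. It lists every instance in your configured region that is not terminated, not only those created for this tutorial.

```shell
# Read `aws ec2 describe-instances` JSON on stdin and print the IDs
# of instances whose state is anything other than "terminated".
live_instance_ids() {
  jq -r '.Reservations[].Instances[]
         | select(.State.Name != "terminated")
         | .InstanceId'
}

# Usage (run manually):
#   aws ec2 describe-instances --output json | live_instance_ids
```

An empty result means no live instances remain in that region. Instances that were just terminated can briefly report a shutting-down state before they disappear from the list.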
