Getting started with the Apache Hadoop stack can be a challenge, whether you’re a computer science student or a seasoned developer. There are many moving parts, and unless you get hands-on experience with each of those parts in a broader use-case context with sample data, the climb will be steep.
Working through this tutorial with Cloudera's QuickStart VM or Docker image as a sandbox environment will show you how to get started with some of the tools provided in CDH (Cloudera's platform containing Hadoop and related projects) and how to manage your services via Cloudera Manager. It will also give you a taste of what it means to “Ask bigger questions.” By the end of this tutorial, you will:
- Understand how to use some of the powerful tools in CDH
- Know how to set up and execute some basic business intelligence and analytics use cases
- Be able to explain to your manager why you deserve a raise!
Check out our community for FAQs or post to our Discussion Forum if you run into any issues.
Note: Cloudera does not support CDH cluster deployments using hosts in Docker containers. Some parts of this tutorial require Cloudera Manager to be running; other parts also require an enterprise license or trial to be activated. To enable these parts of the tutorial, choose one of the following options:
- To use Cloudera Express (free), run “Launch Cloudera Express” on the Desktop in Cloudera Manager. This requires at least 8 GB of RAM and at least 2 virtual CPUs.
- To begin a 60-day trial of Cloudera Enterprise with advanced management features, run “Launch Cloudera Enterprise (trial)” on the Desktop. This requires at least 10 GB of RAM and at least 2 virtual CPUs.
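If you are unsure which option your sandbox can handle, the memory and CPU minimums above can be checked with a few lines of Python. This is only a rough sketch: the names `REQUIREMENTS`, `eligible_options`, and `detect_resources` are hypothetical helpers and not part of CDH or Cloudera Manager, and the detection step assumes a Linux guest where `os.sysconf` exposes `SC_PAGE_SIZE` and `SC_PHYS_PAGES`.

```python
import os

# Minimums quoted in this tutorial: (RAM in GB, virtual CPUs).
# The dictionary and function names are illustrative, not a Cloudera API.
REQUIREMENTS = {
    "Cloudera Express": (8, 2),
    "Cloudera Enterprise (trial)": (10, 2),
}


def eligible_options(ram_gb, cpus):
    """Return the options whose stated minimums are met by the given resources."""
    return [
        name
        for name, (min_ram, min_cpus) in REQUIREMENTS.items()
        if ram_gb >= min_ram and cpus >= min_cpus
    ]


def detect_resources():
    """Best-effort detection of (RAM in GiB, CPU count); assumes a Linux guest."""
    ram_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    return ram_bytes / 1024**3, os.cpu_count() or 1
```

Inside the VM or container, `eligible_options(*detect_resources())` lists the options the guest appears to satisfy. The operating system typically reports slightly less memory than was allocated to the guest, so treat a near miss as inconclusive and check the allocation in your hypervisor or Docker settings instead.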
Then, you can get started.