Getting started with the Apache Hadoop stack can be a challenge, whether you’re a computer science student or a seasoned developer. There are many moving parts, and unless you get hands-on experience with each of those parts in a broader use-case context with sample data, the climb will be steep.
Following this tutorial in Cloudera’s QuickStart VM or Docker image, used as a sandbox environment, will show you how to get started with some of the tools in CDH (Cloudera’s platform containing Hadoop and related projects) and how to manage your services via Cloudera Manager. It will also give you a taste of what it means to “Ask bigger questions.” By the end of this tutorial you will:
- Understand how to use some of the powerful tools in CDH
- Know how to set up and execute some basic business intelligence and analytics use cases
- Be able to explain to your manager why you deserve a raise!
If at any point you get stuck, just post a note in our Discussion Forum and we'll get you unstuck.
Note: Cloudera does not support CDH cluster deployments using hosts in Docker containers. Some parts of this tutorial require Cloudera Manager to be running; other parts also require an enterprise license or trial to be activated. To enable these parts of the tutorial, choose one of the following options:
- To use Cloudera Express (free), run "Launch Cloudera Express" on the Desktop. This requires at least 8 GB of RAM and at least 2 virtual CPUs.
- To begin a 60-day trial of Cloudera Enterprise with advanced management features, run "Launch Cloudera Enterprise (trial)" on the Desktop. This requires at least 10 GB of RAM and at least 2 virtual CPUs.
Then, you can get started.
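Before picking one of the two options, it can help to confirm that the sandbox has enough resources. The following is a minimal sketch, not part of the QuickStart tooling: the names `REQUIREMENTS`, `eligible_options`, and `detect_resources` are hypothetical, the thresholds are simply the minimums quoted in the list above, and the detection step assumes a Linux guest where `os.sysconf` exposes the page size and page count.

```python
import os

# Minimums quoted in the tutorial text; the keys are just labels for this sketch.
REQUIREMENTS = {
    "Cloudera Express": {"ram_gb": 8, "cpus": 2},
    "Cloudera Enterprise (trial)": {"ram_gb": 10, "cpus": 2},
}


def eligible_options(ram_gb, cpus):
    """Return the names of the options whose RAM and CPU minimums are both met."""
    return [
        name
        for name, req in REQUIREMENTS.items()
        if ram_gb >= req["ram_gb"] and cpus >= req["cpus"]
    ]


def detect_resources():
    """Best-effort detection of total RAM (in GiB) and CPU count on Linux."""
    page_size = os.sysconf("SC_PAGE_SIZE")
    page_count = os.sysconf("SC_PHYS_PAGES")
    ram_gb = page_size * page_count / 1024**3
    cpus = os.cpu_count() or 1  # cpu_count() may return None
    return ram_gb, cpus


if __name__ == "__main__":
    ram_gb, cpus = detect_resources()
    print(f"Detected {ram_gb:.1f} GiB RAM and {cpus} CPU(s)")
    print("Options that fit:", eligible_options(ram_gb, cpus) or "none")
```

Treat the result as a rough guide: a guest usually reports slightly less memory than the amount allocated to it, so a VM configured with exactly 8 GB may be reported just under the threshold.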