Hadoop Tutorial

This tutorial describes the most important user-facing facets of the Apache Hadoop MapReduce framework. Apache Hadoop MapReduce consists of client APIs for writing applications and a runtime on which to run them. There are two versions of the API (old and new) and two versions of the runtime (MRv1 and MRv2). This tutorial covers the old API, found in the org.apache.hadoop.mapred package, and MRv1.

Prerequisites

Ensure that CDH is installed, configured, and running. The quickest way to get started is to use a CDH4 QuickStart VM.
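
As a quick sanity check for the prerequisite above, the following sketch tests whether the hadoop client command is available in the current shell. The function name check_hadoop is hypothetical and introduced only for illustration. The check confirms only that the client is on the PATH; it does not prove that the cluster services are configured and running.

```shell
# Hypothetical helper (not part of CDH): report whether the `hadoop`
# client command is available on PATH.
check_hadoop() {
  if command -v hadoop >/dev/null 2>&1; then
    # `hadoop version` prints the installed Hadoop build details.
    hadoop version
  else
    echo "hadoop command not found on PATH"
    return 1
  fi
}

# Run the check and print a hint if the client is missing.
check_hadoop || echo "Install CDH or start the QuickStart VM first."
```

If the client is present, listing the HDFS root with `hadoop fs -ls /` is one further way to see whether HDFS is reachable, assuming the services have been started.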