Cloudera’s Hadoop Distribution
A free, stable distribution offering RPM, Debian, AWS and automatic configuration options.
Find out why CDH is the most popular Hadoop distribution available today.
Features

- HDFS – Self-healing distributed file system
- MapReduce – Powerful, parallel data processing framework
- Hadoop Common – A set of utilities that support the Hadoop subprojects
- HBase – Hadoop database for random read/write access
- Hive – SQL-like queries and tables on large datasets
- Pig – Dataflow language and compiler
- Oozie – Workflow for interdependent Hadoop jobs
- Sqoop – Integrate databases and data warehouses with Hadoop
- Flume – Highly reliable, configurable streaming data collection
- ZooKeeper – Coordination service for distributed applications
- Hue – User interface framework and SDK for visual Hadoop applications
Only Cloudera’s Distribution for Hadoop is…
- Hardened. Patched with improvements from future releases that increase stability and performance.
- Integrated and simplified. Cloudera manages cross-component integration, versions, and interdependencies.
- Functionally rich. The broadest feature set of any Hadoop distribution.
- Proven in the enterprise. In use in financial services, telecom, web, manufacturing, media, and retail industries.
- Flexible. Run CDH on premises or in the cloud, on multiple OS versions with multiple installation options.
- Supported. Backed by the project founders and committers.








