Cloudera’s Hadoop Distribution

A free, stable distribution offering RPM, Debian, AWS and automatic configuration options.

Hear from Doug Cutting on Cloudera’s Distribution for Hadoop

Cloudera’s Distribution for Hadoop (CDH) sets a new standard for Hadoop-based data management platforms. It is the most comprehensive platform available today and significantly accelerates deployment of Hadoop in your organization. CDH is based on the most recent stable version of Apache Hadoop. It includes some useful patches backported from future releases, as well as improvements we have developed for our customers.

Find out why CDH is the most popular Hadoop distribution available today.

Features

CDH Diagram

  • HDFS – Self-healing distributed file system
  • MapReduce – Powerful, parallel data processing framework
  • Hadoop Common – A set of utilities that support the Hadoop subprojects
  • HBase – Hadoop database for random read/write access
  • Hive – SQL-like queries and tables on large datasets
  • Pig – Dataflow language and compiler
  • Oozie – Workflow for interdependent Hadoop jobs
  • Sqoop – Integrate databases and data warehouses with Hadoop
  • Flume – Highly reliable, configurable streaming data collection
  • ZooKeeper – Coordination service for distributed applications
  • Hue – User interface framework and SDK for visual Hadoop applications

Only Cloudera’s Distribution for Hadoop is…

  • Hardened. Includes patches backported from future Apache Hadoop releases for better stability and performance.
  • Integrated and simplified. Cloudera manages cross-component integration, versions, and interdependencies.
  • Functionally rich. The broadest feature set of any Hadoop distribution.
  • Proven in the enterprise. In use in financial services, telecom, web, manufacturing, media, and retail industries.
  • Flexible. Run CDH on premises or in the cloud, on multiple OS versions with multiple installation options.
  • Supported. Backed by the project founders and committers.

Download CDH

Free = good

Apache 2.0 License

Cloudera’s Distribution for Hadoop is released under the Apache 2.0 license, and is distributed for free through our public YUM and APT repositories.

We also contribute our Hadoop fixes back to the open source community.

Technology you can trust

  • Amazon Web Services
  • Rackspace
  • CentOS
  • Red Hat
  • Debian
  • SoftLayer
  • Fedora
  • Ubuntu

Our distribution is well tested on Red Hat variants, including CentOS 5, RHEL 5, and Fedora 8; on Debian platforms such as Ubuntu; and on hosted cloud offerings from Amazon Web Services, Rackspace, and SoftLayer.