CDH is Cloudera's software distribution containing Apache Hadoop and related projects. All components are 100% open source (Apache License); see Release Notes. Unless otherwise specified, use these installation instructions for all CDH components.

Apache Avro

Release: 1.7.6
Data serialization: rich data structures, a fast/compact binary format, and RPC.

Apache Crunch

Release: 0.11.0
Java library for writing, testing, and running MapReduce pipelines more easily. Only in CDH!

Apache DataFu

Release: 1.1.0 (incubating)
Library of useful statistical UDFs for doing large-scale analysis. Only in CDH!

Apache Flume

Release: 1.6.0
Collects/aggregates event data and streams it into HDFS or HBase in real time.

Apache Hadoop

Release: 2.6.0
Horizontally scalable storage (HDFS), resource management (YARN), and processing (MapReduce).

Apache HBase

Release: 1.2.0
Scalable record and table storage for Hadoop with random read/write access.

Apache Hive

Release: 1.1.0
SQL framework for doing batch transformation (ETL) of Hadoop data.

Apache Kafka

Release: 0.9.0
A distributed, resilient publish-subscribe messaging service.

HUE

Release: 3.10.0
Web-based GUI that makes it easy for users to work with Hadoop data.

Apache Impala

Release: 2.6.0 (incubating)
For high-concurrency, low-latency SQL queries across HDFS, S3, or HBase.

Kite SDK

Release: 1.0.0
APIs, examples, and docs for building apps on top of Hadoop. Only in CDH!

Apache Parquet

Release: 1.5
Provides compressed, efficient columnar data representation in Hadoop.

Apache Mahout

Release: 0.9.0
Libraries for clustering, classification and collaborative filtering.

Apache Oozie

Release: 4.1.0
A workflow scheduler for managing all your Hadoop jobs efficiently.

Apache Pig

Release: 0.12.0
Offers a framework for batch analysis of large data sets using a high-level language.

Cloudera Search

Release: 1.0.0
Offers free-text, Google-style search of Hadoop data for business users. Only in CDH!

Apache Sentry

Release: 1.5.1
Provides granular, role-based access control for Hadoop users.

Apache Spark

Release: 1.6
In-memory processing engine that makes jobs faster to run and easier to write.

Apache Sqoop

Release: 1.4.6 / 1.99.5
Moves data between relational databases and HDFS in a highly scalable way.

Apache ZooKeeper

Release: 3.4.5
Highly reliable distributed coordination service, used by HBase among other components.