

Key resources


Explore CDH, Cloudera’s software platform containing the Apache Hadoop ecosystem


Learn best practices, use cases, and internals from Cloudera engineers and users


Ask questions, get answers, and browse or participate in the Cloudera user community


Explore emerging ecosystem components that could shape the future


Inspect and contribute to Cloudera’s open source software projects, including Apache Impala (incubating)


Watch featured developer videos on a variety of subjects (Apache Spark, Apache Kudu, and more)

Join the Developer Program

The Cloudera Developer Program is a low-cost, low-risk way for individuals and small companies to ramp up on Cloudera's platform, while getting help when needed. Sign up online!

What's new

Jan 6, 2017 - Learn how to use Cloudera Director, Microsoft Active Directory (AD DS, AD CS, AD DNS), SAMBA, and SSSD to deploy a secure EDH cluster for workloads in the public cloud.

HDFS DataNode Scanners and Disk Checker Explained

Dec 20, 2016 - As many of us know, HDFS stores data on DataNodes and tolerates DataNode failures by replicating the same data to multiple DataNodes. But what exactly happens when some DataNodes’ disks are failing?

How-to: Automate Your Sparklyr Environment With Cloudera Director

Dec 15, 2016 - Sparklyr provides a dplyr interface to Spark and lets users leverage crucial machine learning algorithms from Spark MLlib and H2O Sparkling Water. This greatly lowers the barrier to entry for R users adopting Spark as a tool for big data and should go a long way toward enabling R workloads to migrate to Hadoop.