The Hadoop Ecosystem
Learn About the Projects Comprising an Enterprise Data Hub
CDH is Cloudera's 100% open-source distribution and the world's leading Apache Hadoop solution. More enterprises have downloaded CDH than all other distributions combined. Along with open-source projects like Apache Hive, Pig, and HBase, and Cloudera's solutions, including Impala, Search, Cloudera Manager, Navigator, and Enterprise BDR, CDH enables a fully enterprise-ready Hadoop experience so that you can derive the most value from all your data.
These tools provide the core functionality to allow you to store both complex and structured data and perform sophisticated processing and analysis. This video demystifies Hadoop and explains how it works, giving you an understanding of how components fit together and build on one another to provide a scalable and powerful system.
Learn about the projects surrounding Apache Hadoop, which complete the greater ecosystem of available big data processing tools.
At their core, the improvements in YARN and MapReduce 2 separate cluster resource management from MapReduce-specific logic. YARN enables Hadoop to share resources dynamically among multiple parallel processing frameworks, such as Cloudera Impala; allows more sensible, finer-grained resource configuration for better cluster utilization; and scales Hadoop to accommodate more and larger jobs.
Hive enables analysis of large data sets using HiveQL, a language very similar to ANSI SQL. This means anyone who can write SQL queries can access data stored on the Hadoop cluster. This tutorial introduces the functionality of Hive, as well as its various applications for data analysis and data warehousing.
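As a rough illustration of the kind of query this implies, the sketch below runs an ordinary SQL aggregation from Python. It uses the standard-library sqlite3 module purely as a stand-in, since no Hive cluster is assumed here, and the page_views table and its columns are hypothetical. A GROUP BY query of this shape can also be written for Hive, though Hive would run it as distributed jobs over data in the cluster rather than in-process.

```python
import sqlite3

# Hypothetical sample data: (user_id, page) pairs.
sample = [("u1", "/home"), ("u2", "/home"), ("u1", "/pricing")]


def top_pages(rows):
    """Return (page, views) pairs, most viewed first, using plain SQL."""
    # In-memory SQLite database: a local stand-in, NOT a Hive connection.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE page_views (user_id TEXT, page TEXT)")
    conn.executemany("INSERT INTO page_views VALUES (?, ?)", rows)

    # A standard aggregation; ties are broken alphabetically by page.
    query = """
        SELECT page, COUNT(*) AS views
        FROM page_views
        GROUP BY page
        ORDER BY views DESC, page ASC
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for page, views in top_pages(sample):
        print(page, views)
```

The point of the sketch is only that the analyst writes a declarative query and leaves execution to the engine.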
Pig provides a simple-to-understand data flow language, Pig Latin, for analyzing large data sets. Pig scripts are automatically converted into MapReduce jobs by the Pig interpreter, so you can analyze the data in a Hadoop cluster even if you aren't familiar with MapReduce. Find out more about Pig use cases, Pig Latin, and the benefits of using Pig.
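To give a feel for what Pig spares you from writing by hand, here is a minimal single-process sketch of the map, shuffle, and reduce steps behind a word count. It is plain Python, not Pig Latin and not the Hadoop API; the function names are illustrative only, and a real job would run these phases in parallel across the cluster.

```python
from collections import defaultdict


def map_phase(lines):
    """Map step: emit a (word, 1) pair for every word in every line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce step: collapse each key's list of values into a total."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(lines):
    """Chain the three steps the way a MapReduce job would."""
    return reduce_phase(shuffle(map_phase(lines)))


if __name__ == "__main__":
    print(word_count(["big data", "Big cluster"]))
```

A data flow language lets you state the grouping and counting directly, leaving the decomposition into steps like these to the tool.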
Work at the speed of thought! This e-learning course explores Cloudera Impala's features, architecture, and benefits over legacy Hadoop platforms. Learn how to run interactive queries inside Impala and understand how it optimizes data systems. This free online course includes a training module, homework, and an Impala demo VM download to experiment with this powerful new tool.
Learn how to use interactive, full-text search to quickly find relevant data in Hadoop and solve critical business problems simply and in real time. Cloudera Search integrates Apache Solr, an established, feature-rich, open-source search platform with extensible APIs, into CDH. In this e-learning module, you will learn the fundamentals, use cases, and features of Cloudera Search. The module includes a short discussion of Cloudera Search architecture and a product demonstration.
This brief introduction to HBase, Hadoop's database, explains HBase usage scenarios, how HBase compares to an RDBMS, and how HBase complements Hadoop.
Cloudera Manager simplifies deployment, configuration, diagnostics, and reporting for CDH in production. Learn how to set up and customize Cloudera Manager to monitor and improve the performance of a Hadoop cluster of any size, increase compliance, and reduce costs.
Learn the objectives and features of Cloudera Enterprise BDR and see a demonstration of the new backup and disaster recovery product. Centrally configure and manage disaster recovery workflows for files (HDFS) and metadata (Hive) through an easy-to-use graphical interface. Consistently meet or exceed Service Level Agreements (SLAs) and Recovery Time Objectives (RTOs) through simplified management and process automation.
- Online Training: Hadoop Essentials
Get started on your journey to Hadoop.
- Webinar: Get Started with Hadoop in Less Than Thirty Minutes
Learn how to get a Hadoop POC cluster up and running in the cloud in no time.
- Webinar: Hadoop: Extending Your Data Warehouse
Find out how Hadoop can rationalize your current data infrastructure to minimize additional budget outlay, offload data transformation cycles, and enable data warehouse optimization.
- Webinar: Enterprise Data Hub
Check out the next big thing driving business value from Big Data.
- If you're a developer, administrator, data analyst, HBase specialist, or aspiring data scientist, Cloudera offers training and certification to meet your needs.