Cloudera Essentials for Apache Hadoop
Get Started on Your Journey to Hadoop
Learn how Apache Hadoop addresses the limitations of traditional computing, helps businesses overcome real challenges, and powers new types of Big Data analytics. This series also introduces the rest of the Apache Hadoop ecosystem and outlines how to prepare the data center and manage Hadoop in production.
Explore the basics of Apache Hadoop, including the Hadoop Distributed File System (HDFS), MapReduce, and the anatomy of a Hadoop cluster.
There are many components working together in the Apache Hadoop stack. By understanding how each functions, you gain more insight into Hadoop’s functionality in your own IT environment. This chapter goes beyond the motivation for Apache Hadoop and dissects the Hadoop Distributed File System (HDFS), MapReduce, and the general topology of a Hadoop cluster.
Learn how Apache Hadoop is used in the real world. This chapter explores how to use Apache Hadoop to harness Big Data and solve business problems in new ways. It covers common business challenges that Hadoop can address, the origins of Big Data, the types of analyses powered by Hadoop, and industry use cases for Hadoop.
Various projects make up the Apache Hadoop ecosystem, and each improves data storage, management, interaction, and analysis in its own way. This chapter reviews Hive, Pig, Impala, HBase, Flume, Sqoop, and Oozie, explaining how they function within the stack and how they help integrate Hadoop into the production environment.
It is critical to understand how Apache Hadoop will affect the current setup of the data center and to plan ahead. This chapter helps you seamlessly integrate the platform into your environment. Find out what resources are required to deploy Hadoop, how to plan for cluster capacity, and how to staff for your Big Data strategy.
Once you have Hadoop implemented in your environment, what’s next? How do you get the most out of the technology while managing it on a daily basis? This chapter reviews the previous topics, introduces CDH (Cloudera's Distribution Including Apache Hadoop), and describes how Cloudera can help you maximize the value of all your data.
- Online Training: Introduction to Hadoop and MapReduce
Explore the basics of writing code that processes data stored in HDFS.
- Online Training: Cloudera Manager
Learn how Cloudera Manager can improve your cluster's performance and reduce its costs.
- Webinar: Enterprise Data Hub
Check out the next big thing driving business value from Big Data.
- Webinar: Insight into the EDH
Learn how the enterprise data hub can transform your business and deliver competitive advantage.
- Whether you're a developer, administrator, data analyst, HBase specialist, or aspiring data scientist, Cloudera offers training and certification to meet your needs.