Resources for Developers
Cloudera University's e-learning courses present a deeper dive into the projects, skills, and techniques that aid and complement the core topics covered by the developer learning path. These on-demand videos address the concepts required to achieve true expertise. They also include interactive demonstrations and lab instruction so that you can work your way through technical challenges in your own time and at your own pace.
Online Training: Introduction to Hadoop and MapReduce
Start on your path to big data expertise with our open, online Udacity course. Cloudera University’s free three-lesson program covers the fundamentals of Hadoop and includes hands-on practice developing MapReduce code that runs on data in HDFS.
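For a sense of what MapReduce code looks like, here is a minimal word-count sketch in plain Python. It is not taken from the course materials, and it runs locally rather than on a cluster. The function names (`mapper`, `reducer`, `word_count`) and the sample input are hypothetical, and the in-memory sort only stands in for the shuffle-and-sort step that Hadoop performs between the two phases.

```python
from itertools import groupby
from operator import itemgetter


def mapper(lines):
    """Map phase: emit a (word, 1) pair for every word in the input."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def reducer(pairs):
    """Reduce phase: sum the counts emitted for each distinct word.

    Sorting here is a local stand-in for Hadoop's shuffle-and-sort,
    which groups all values for one key before the reducer sees them.
    """
    for word, group in groupby(sorted(pairs), key=itemgetter(0)):
        yield word, sum(count for _, count in group)


def word_count(lines):
    """Run the map and reduce phases in memory and return a dict."""
    return dict(reducer(mapper(lines)))


if __name__ == "__main__":
    # Hypothetical sample input; on a cluster this would be a file in HDFS.
    sample = ["the quick brown fox", "the lazy dog", "The end"]
    for word, count in sorted(word_count(sample).items()):
        print(word, count)
```

On a real cluster the two phases would typically live in separate scripts that read from standard input, but the division of work between them is the same.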
Learn how Apache Hadoop addresses the limitations of traditional computing, helps businesses overcome real challenges, and powers new types of big data analytics. This series also introduces the rest of the Apache Hadoop ecosystem and outlines how to prepare the data center and manage Hadoop in production.
At their core, the improvements in YARN and MapReduce 2 separate cluster resource management from MapReduce-specific logic. YARN enables Hadoop to share resources dynamically between multiple parallel processing frameworks such as Cloudera Impala, allows more sensible and finer-grained resource configuration for better cluster utilization, and scales Hadoop to accommodate more and larger jobs.
In this on-demand webinar, you will learn who is best suited to attend the live training, what prior knowledge you should have, and what topics the course covers. We present a short portion of Cloudera's actual Hadoop Developer Training, covering the differences between the new and old APIs, why both exist, and which you should use when writing your MapReduce code.
Learn what the course covers, from capturing data to building a search interface. We cover the spectrum of processing engines, Apache projects, and ecosystem tools available for converged analytics, and discuss who is best suited to attend the full Big Data Applications course and what prior knowledge you should have. This free webinar includes a portion of the live class and explains the benefits of building applications with an enterprise data hub.
Learn what Apache Spark is and how it compares to Hadoop MapReduce, how to filter, map, reduce, and save Resilient Distributed Datasets (RDDs), who is best suited to attend the full Developer Training for Apache Spark course, what prior knowledge you should have, and the benefits of building Spark applications as part of an enterprise data hub.
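The filter, map, reduce, and save steps mentioned above can be illustrated without a Spark installation. The sketch below uses Python's built-in equivalents on an ordinary list, with comments naming the PySpark RDD methods they correspond to. The function name `run_pipeline`, the threshold, and the sample values are hypothetical, and the code shows only the shape of the operations, not Spark's distributed or lazy execution.

```python
import os
import tempfile
from functools import reduce


def run_pipeline(values, out_path):
    """Filter, map, reduce, and save a list of numbers.

    Each step mirrors an RDD operation of the same name in PySpark.
    """
    # rdd.filter(...): keep only values at or above the threshold.
    kept = filter(lambda x: x >= 10, values)

    # rdd.map(...): transform every remaining value.
    doubled = list(map(lambda x: x * 2, kept))

    # rdd.reduce(...): combine all values into one result.
    # The initial value 0 is a local convenience so that an empty
    # list returns 0 instead of raising an error.
    total = reduce(lambda a, b: a + b, doubled, 0)

    # rdd.saveAsTextFile(...): write one element per line.
    with open(out_path, "w") as handle:
        for value in doubled:
            handle.write(f"{value}\n")

    return total


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "doubled.txt")
    print(run_pipeline([12, 7, 30, 5, 18], path))
```

In PySpark, the same chain would start from an RDD (for example, one created with `sc.parallelize`) rather than a list, and Spark would defer the filter and map work until the reduce or save action runs.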
Learn how HBase Training can help you develop your HBase use case, design optimal schemas, and identify, avoid, and resolve performance bottlenecks. In this on-demand webinar, we present two short portions of Cloudera's full HBase Training, providing an overview of accessing data with the HBase API and executing Scan operations with both Java and Python, followed by Q&A.
Pig is an Apache project that uses a scripting language to query and analyze large data sets. With Apache Pig, users can create MapReduce programs without writing Java code. This e-learning module teaches you how to write user-defined functions (UDFs) that can be executed inside of Pig to extend performance and develop a custom library of operations. We discuss what Pig UDFs are, supported functions and languages, and how to write custom UDFs in Java and Python. The module includes a hands-on exercise where you will write your own UDF in Python, complete with a sample solution.
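As a rough idea of what a Python UDF for Pig can look like, here is a minimal sketch. It is not the module's exercise or its sample solution. The function name `normalize` and the schema string are hypothetical. The `outputSchema` decorator is how Pig learns a Python UDF's return type; the fallback definition is an assumption added here only so the file also runs outside Pig for local testing.

```python
try:
    # Available when the script runs under Pig's Python UDF support.
    from pig_util import outputSchema
except ImportError:
    # Local stand-in (assumption): records the schema on the function
    # so the UDF can be imported and tested without Pig installed.
    def outputSchema(schema):
        def wrap(func):
            func.outputSchema = schema
            return func
        return wrap


@outputSchema("normalized:chararray")
def normalize(text):
    """Lowercase a string and collapse runs of whitespace.

    Pig passes null fields as None, so None is returned unchanged.
    """
    if text is None:
        return None
    return " ".join(text.lower().split())


if __name__ == "__main__":
    print(normalize("  Hello   Hadoop  WORLD "))
```

A Pig script would then load the file with a `REGISTER` statement and call the function by its registered alias; the exact statement depends on which Python engine your Pig installation uses.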
Hive is an Apache project that facilitates ad hoc queries and analyses of large data sets in the Hadoop cluster using a SQL-like language. This e-learning module teaches you how to write user-defined functions (UDFs) to augment Hive's built-in capabilities. We discuss why UDFs are necessary, what kinds of UDFs exist, and how to write custom UDFs in Java. The module includes a hands-on exercise where you will write your own UDF, complete with a sample solution.
In this video, you will learn what data scientists do, how they think about problems, the relationship between data science and Hadoop, and how Cloudera’s Introduction to Data Science course can help you join this growing and increasingly important profession. The video concludes with a Q&A session with Cloudera Senior Director of Data Science Josh Wills.