Cloudera Data Analyst Training: Using Pig, Hive, and Impala with Hadoop

Cloudera University’s three-day data analyst training course, focusing on Apache Pig, Apache Hive, and Cloudera Impala, teaches you to apply traditional data analytics and business intelligence skills to Big Data.

Date: Tuesday, Jun 04 2013

Description

Cloudera presents the tools participants need to access, manipulate, and analyze complex data sets using SQL and familiar scripting languages. Apache Hive makes multi-structured data accessible to analysts, database administrators, and others without Java programming expertise. Apache Pig applies the fundamentals of familiar scripting languages to the Hadoop cluster. Cloudera Impala enables real-time interactive analysis of the data stored in Hadoop via a native SQL environment. Through lectures and interactive, hands-on exercises, attendees will cover the full Hadoop ecosystem, including topics such as:

  • The fundamentals of Apache Hadoop and data ETL (extract, transform, load), ingestion, and processing with Hadoop tools
  • Joining multiple data sets and analyzing disparate data with Pig
  • Organizing data into tables, performing transformations, and simplifying complex queries with Hive
  • Performing real-time interactive analyses on massive data sets stored in HDFS or HBase using SQL with Impala
  • Picking the best tool for a given task in Hadoop, achieving interoperability, and managing recurring workflows

Next Steps