CCA Data Analyst Exam (CCA159) 

  • Number of Questions: 8–12 performance-based (hands-on) tasks on a Cloudera Enterprise cluster. See below for the full cluster configuration.
  • Time Limit: 120 minutes
  • Passing Score: 70%
  • Language: English
  • Price: USD $295

Exam Question Format

You are given eight to twelve customer problems, each with a unique large data set, along with a CDH cluster and 120 minutes. For each problem, you must implement a technical solution with a high degree of precision that meets all the requirements. You may use any tool or combination of tools on the cluster (see list below); you pick the tool(s) that are right for the job. You must possess enough knowledge to analyze the problem and arrive at an optimal approach given the time allowed. You need to know what to do and then do it on a live cluster, under a time limit and while being watched by a proctor.

Evaluation, Score Reporting, and Certificate

Your exam is graded immediately upon submission and you are e-mailed a score report the same day as your exam. Your score report displays the problem number for each problem you attempted and a grade on that problem. If you fail a problem, the score report includes the criteria you failed (e.g., “Records contain incorrect data” or “Incorrect file format”). We do not report more information in order to protect the exam content. Read more about reviewing exam content on the FAQ.

If you pass the exam, you receive a second e-mail within a few days of your exam with your digital certificate as a PDF, your license number, a LinkedIn profile update, and a link to download your CCA logos for use in your personal business collateral and social media profiles.

Audience and Prerequisites

Candidates for CCA Data Analyst can be SQL developers, data analysts, business intelligence specialists, developers, system architects, and database administrators. There are no prerequisites.

The CCA Data Analyst exam was created to identify talented SQL developers looking to stand out and be recognized by employers looking for these skills. It is recommended that those looking to achieve this certification start by taking Cloudera’s Data Analyst training course, which has the same objectives as the exam.

Required Skills

Prepare the Data

Use Extract, Transform, Load (ETL) processes to prepare data for queries.

  • Import data from a MySQL database into HDFS using Sqoop

  • Export data to a MySQL database from HDFS using Sqoop

  • Move data between tables in the metastore

  • Transform values, columns, or file formats of incoming data before analysis
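
As a rough illustration of the Sqoop import and export skills listed above, the following Python sketch assembles the two command lines as argument lists. It only builds and prints the commands; it does not run Sqoop. The JDBC URL, user name, password file, table names, and HDFS paths are hypothetical placeholders, not values from the exam environment.

```python
import shlex


def sqoop_import_args(jdbc_url, username, password_file, table, target_dir, num_mappers=1):
    """Build a `sqoop import` command that copies one MySQL table into HDFS."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--username", username,
        "--password-file", password_file,
        "--table", table,
        "--target-dir", target_dir,
        "--num-mappers", str(num_mappers),
    ]


def sqoop_export_args(jdbc_url, username, password_file, table, export_dir, field_delim=","):
    """Build a `sqoop export` command that loads delimited HDFS files into a MySQL table."""
    return [
        "sqoop", "export",
        "--connect", jdbc_url,
        "--username", username,
        "--password-file", password_file,
        "--table", table,
        "--export-dir", export_dir,
        "--input-fields-terminated-by", field_delim,
    ]


# Hypothetical connection details, for illustration only.
JDBC_URL = "jdbc:mysql://db.example.com/retail"

import_cmd = sqoop_import_args(
    JDBC_URL, "analyst", "/user/analyst/.mysql-password",
    table="orders", target_dir="/user/analyst/orders",
)
export_cmd = sqoop_export_args(
    JDBC_URL, "analyst", "/user/analyst/.mysql-password",
    table="orders_summary", export_dir="/user/analyst/orders_summary",
)

# Print shell-quoted command lines that could be pasted into a terminal.
print(" ".join(shlex.quote(arg) for arg in import_cmd))
print(" ".join(shlex.quote(arg) for arg in export_cmd))
```

Keeping the arguments in a list rather than a single string avoids quoting mistakes if the command is later handed to `subprocess.run`.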

Provide Structure to the Data

Use Data Definition Language (DDL) statements to create or alter structures in the metastore for use by Hive and Impala.

  • Create tables using a variety of data types, delimiters, and file formats

  • Create new tables using existing tables to define the schema

  • Improve query performance by creating partitioned tables in the metastore

  • Alter tables to modify existing schema

  • Create views in order to simplify queries
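
The DDL skills above can be practiced off-cluster with a small stand-in. The sketch below uses Python's built-in `sqlite3` module, not Hive or Impala, so it is only an approximation: SQLite has no `PARTITIONED BY`, `ROW FORMAT DELIMITED`, or `STORED AS` clauses, and those are left out here. Table and column names are made up for the example.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the metastore (assumption).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Create a table with a few typed columns and load sample rows.
cur.execute("CREATE TABLE orders (order_id INTEGER, customer TEXT, amount REAL)")
cur.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "alice", 20.0), (2, "bob", 35.5), (3, "alice", 12.25)],
)

# Create a new, empty table whose columns come from an existing table.
# The always-false predicate copies the column names but no rows.
cur.execute("CREATE TABLE orders_archive AS SELECT * FROM orders WHERE 1 = 0")

# Alter the existing schema by adding a column.
cur.execute("ALTER TABLE orders ADD COLUMN status TEXT")

# Create a view that hides a filter from later queries.
cur.execute(
    "CREATE VIEW large_orders AS "
    "SELECT order_id, customer, amount FROM orders WHERE amount > 15"
)

archive_columns = [row[1] for row in cur.execute("PRAGMA table_info(orders_archive)")]
orders_columns = [row[1] for row in cur.execute("PRAGMA table_info(orders)")]
large_order_ids = [
    row[0] for row in cur.execute("SELECT order_id FROM large_orders ORDER BY order_id")
]

print(archive_columns)
print(orders_columns)
print(large_order_ids)
```

On a real cluster, Hive and Impala also accept `CREATE TABLE ... LIKE` for copying a schema, which SQLite does not support.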

Data Analysis

Use Query Language (QL) statements in Hive and Impala to analyze data on the cluster.

  • Prepare reports using SELECT commands including unions and subqueries

  • Calculate aggregate statistics, such as sums and averages, during a query

  • Create queries against multiple data sources by using join commands

  • Transform the output format of queries by using built-in functions

  • Perform queries across a group of rows using windowing functions
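
A minimal sketch of the query skills above (join, aggregates, and a windowing function), again using Python's `sqlite3` module as a stand-in rather than Hive or Impala. It assumes the bundled SQLite is version 3.25 or later, which is required for window functions. The tables and values are invented for the example.

```python
import sqlite3

# In-memory SQLite database with two small, made-up tables.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE customers (customer_id INTEGER, name TEXT);
CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, amount REAL);
INSERT INTO customers VALUES (1, 'alice'), (2, 'bob');
INSERT INTO orders VALUES (10, 1, 20.0), (11, 1, 30.0), (12, 2, 50.0), (13, 2, 10.0);
""")

# Join two sources and compute aggregate statistics per customer.
totals = cur.execute("""
    SELECT c.name, SUM(o.amount) AS total, AVG(o.amount) AS average
    FROM orders o
    JOIN customers c ON o.customer_id = c.customer_id
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()

# Rank each customer's orders by amount with a windowing function.
ranked = cur.execute("""
    SELECT c.name, o.order_id,
           RANK() OVER (PARTITION BY c.name ORDER BY o.amount DESC) AS amount_rank
    FROM orders o
    JOIN customers c ON o.customer_id = c.customer_id
    ORDER BY c.name, amount_rank
""").fetchall()

print(totals)
print(ranked)
```

The `GROUP BY` query collapses each customer to one row, while the `RANK() OVER (PARTITION BY ...)` query keeps every order row and adds a per-customer rank beside it.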

Have more questions? Check out our Certification FAQ

Exam delivery and cluster information

CCA159 is a hands-on, practical exam using Cloudera technologies. Each user is given their own CDH5 (currently 5.10.1) cluster pre-loaded with Spark, Impala, Crunch, Hive, Pig, Sqoop, Kafka, Flume, Kite, Hue, Oozie, DataFu, and many others. In addition, the cluster comes with Python 2.7 and 3.4, Perl 5.16, Elephant Bird, Cascading 2.6, Brickhouse, Hive Swarm, Scala 2.11, Scalding, IDEA, Sublime, Eclipse, and NetBeans.

Documentation available online during the exam

Cloudera Product Documentation
Apache Hadoop
Apache Hive
Apache Impala
Apache Sqoop
Spark
Apache Crunch
Apache Pig
Kite SDK
Apache Avro
Apache Parquet
Cloudera HUE
Apache Oozie
Apache Flume
DataFu
JDK 7 API Docs
Python 2.7 Documentation
Python 3.4 Documentation
Scala Documentation

Only the documentation, links, and resources listed above are accessible during the exam. All other websites, including Google and other search functionality, are disabled. You may not use notes or other exam aids.
