The exam is intended for developers working with Spark Core and Spark SQL applications in Scala or Python.

  • Time Limit: 120 minutes
  • Passing Score: 75%
  • Language: English
  • Price: USD $250

Exam Format

The exam is offered on the Hortonworks Data Platform 3.0, managed with Ambari 2.7, which includes Spark. Each candidate will be given access to an HDP cluster along with a list of tasks to be performed on that cluster.

Evaluation, Scoring and Reporting

Exam results are usually reported within 10 business days from Hortonworks University. Proctors and training partners are not authorized to report results directly to candidates. Exam results include the candidate’s final score and the required passing score.

Audience and Pre-requisites

The Minimally Qualified Candidate (MQC) for this certification can develop Hadoop applications for ingesting, transforming, and analyzing data stored in Hadoop using the open-source tools of the Hortonworks Data Platform, including Pig, Hive, Sqoop and Flume. Those certified are recognized as having a high level of skill in Hadoop application development and have demonstrated that knowledge by performing the objectives of the HDPCD exam on a live HDP cluster.

Exam Objectives

The HDP Certified Developer (HDPCSD) exam covers three main categories of tasks:

Core Spark

  • Write a Spark Core application in Python or Scala
  • Initialize a Spark application
  • Run a Spark job on YARN
  • Create an RDD
  • Create an RDD from a file or directory in HDFS
  • Persist an RDD in memory or on disk
  • Perform Spark transformations on an RDD such as filtering and aggregations
  • Perform Spark actions on an RDD
  • Create and use broadcast variables and accumulators
  • Configure Spark properties
  • Ingest data using SparkSession
  • Sort results and write out to HDFS or other supported destinations

Spark SQL

  • Create Spark DataFrames from an existing RDD
  • Perform operations on a DataFrame
  • Write a Spark SQL application
  • Use Hive with ORC from Spark SQL
  • Write a Spark SQL application that reads and writes data from Hive tables
  • Invoke the SQL API or SparkSession SQL functionality to select and produce results
  • Use join capabilities to produce analytic results
  • Rename DataFrame/Dataset columns to produce clearer results

Spark Streaming

  • Use Spark structured streaming to ingest data in real time
  • Invoke streaming transformations and aggregations to produce analytic results
  • Invoke the spark-submit utility on an existing Spark application using the proper arguments

"Cloudera has not only prepared us for success today, but has also trained us to face and prevail over our big data challenges in the future by using Hadoop."


Have questions? Read our Certification FAQ
