The exam is intended for developers working with Spark Core and Spark SQL applications in Scala or Python.
- Time Limit: 120 minutes
- Passing Score: 75%
- Language: English
- Price: USD $250
The exam is offered on the Hortonworks Data Platform 3.0, managed with Ambari 2.7, which includes Spark. Each candidate will be given access to an HDP cluster along with a list of tasks to be performed on that cluster.
Evaluation, Scoring and Reporting
Hortonworks University usually reports exam results within 10 business days. Proctors and training partners are not authorized to report results directly to candidates. Exam results include the candidate’s final score and the required passing score.
Audience and Pre-requisites
The Minimally Qualified Candidate (MQC) for this certification can develop Hadoop applications for ingesting, transforming, and analyzing data stored in Hadoop using the open-source tools of the Hortonworks Data Platform, including Pig, Hive, Sqoop and Flume. Those certified are recognized as having a high level of skill in Hadoop application development and have demonstrated that knowledge by performing the objectives of the HDPCD exam on a live HDP cluster.
The HDP Certified Developer (HDPSCD) exam has two main categories of tasks, Spark Core and Spark SQL, which involve:
- Write a Spark Core application in Python or Scala
- Initialize a Spark application
- Run a Spark job on YARN
- Create an RDD
- Create an RDD from a file or directory in HDFS
- Persist an RDD in memory or on disk
- Perform Spark transformations on an RDD such as filtering and aggregations
- Perform Spark actions on an RDD
- Create and use broadcast variables and accumulators
- Configure Spark properties
- Ingest data using SparkSession
- Sort results and write out to HDFS or other supported destinations
- Create Spark DataFrames from an existing RDD
- Perform operations on a DataFrame
- Write a Spark SQL application
- Use Hive with ORC from Spark SQL
- Write a Spark SQL application that reads and writes data from Hive tables
- Invoke SQL API or SparkSession SQL functionality to select and produce results
- Use join capabilities to produce analytic results
- Rename DataFrame/Dataset columns to produce best results
- Use Spark structured streaming to ingest data in real time
- Invoke streaming transformations and aggregations to produce analytic results
- Invoke spark-submit utility on existing Spark application using proper arguments
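As a rough illustration of how several of the objectives above fit together, here is a minimal PySpark sketch, not an official exam solution. It assumes an input file of "key,value" text lines; the function names (parse_line, core_job, sql_job, main), the application name, the allowed-key list and the paths are all hypothetical. PySpark is imported inside the functions only so that the parsing helper can be tried without a Spark installation.

```python
import sys
from operator import add


def parse_line(line):
    """Split a 'key,value' text line into a (str, int) pair (assumed input format)."""
    key, value = line.split(",")
    return key.strip(), int(value)


def core_job(sc, path, allowed_keys):
    """Spark Core part: create an RDD from a file, filter, aggregate and sort."""
    from pyspark import StorageLevel

    allowed = sc.broadcast(set(allowed_keys))  # read-only lookup shipped to executors
    skipped = sc.accumulator(0)                # counts records dropped by the filter

    def keep(pair):
        if pair[0] in allowed.value:
            return True
        skipped.add(1)  # may over-count if Spark re-runs a task
        return False

    pairs = sc.textFile(path).map(parse_line).filter(keep)
    pairs.persist(StorageLevel.MEMORY_AND_DISK)  # keep the filtered RDD for reuse
    totals = pairs.reduceByKey(add).sortByKey()  # transformations; nothing runs yet
    return totals, skipped


def sql_job(spark, totals):
    """Spark SQL part: build a DataFrame from the RDD and query it with SQL."""
    df = spark.createDataFrame(totals, ["key", "total"])
    df.createOrReplaceTempView("totals")
    return spark.sql("SELECT key, total FROM totals ORDER BY total DESC")


def main(in_path, out_path):
    from pyspark.sql import SparkSession

    spark = (SparkSession.builder
             .appName("hdpscd-sketch")                     # hypothetical name
             .config("spark.sql.shuffle.partitions", "8")  # example Spark property
             .getOrCreate())
    totals, skipped = core_job(spark.sparkContext, in_path, ["a", "b"])
    result = sql_job(spark, totals)
    result.write.mode("overwrite").orc(out_path)  # the write triggers execution
    print("skipped records:", skipped.value)
    spark.stop()


# Runs only when an input path and an output path are supplied.
if __name__ == "__main__" and len(sys.argv) == 3:
    main(sys.argv[1], sys.argv[2])
```

Saved under a hypothetical name such as sketch.py, this could be launched on YARN with something like `spark-submit --master yarn --deploy-mode client sketch.py <input path> <output path>`. The accumulator is read only after the write, because transformations are lazy and the counter stays at zero until an action runs.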