
The Leader in Hadoop Education Expands Comprehensive Training Curriculum to Address the Demand for Spark

PALO ALTO, Calif., September 16, 2015 - Cloudera, the leader in enterprise analytic data management powered by Apache Hadoop™, today announced an expanded Apache Spark training curriculum. These comprehensive, hands-on education courses are tailored to provide developers, analysts, and data scientists with the skills needed to integrate Spark more easily into Hadoop environments. As Spark quickly becomes the preferred data processing framework for Hadoop, in-depth training on integrating Spark with existing applications must keep pace. Cloudera has trained thousands of developers worldwide, and no one trains as many people on Hadoop, including Spark. To continue addressing the growing skills gap and the varying levels of engagement with Spark throughout the user community, Cloudera is offering multiple paths of learning and certification that expand on its current Spark training.

Spark has the core elements to become the next open standard for data processing in Hadoop—allowing for easy development of big data applications that combine batch, streaming, and advanced analytics. However, for companies to take full advantage of its capabilities, even alongside other resources they have available, they must have a firm grasp of Spark within the entire Hadoop ecosystem rather than as a standalone tool.

“Cloudera is the only Hadoop distribution vendor offering an array of in-depth, real-world Spark training courses,” said Mark Morrissey, senior director, Education Services, Cloudera. “Our goal is to teach users how to use Spark alongside other resources they have available in their Hadoop clusters. Whether people are new to Hadoop or have some exposure to it, our curriculum provides an entry point which sets them up for success and helps them to become more resilient as their environment of tools changes.”

The Spark learning paths include the following training and certification options:

Developer Training for Spark and Hadoop I | Developer Training for Spark and Hadoop II: Advanced Techniques

•     After completing these courses, individuals will possess a thorough understanding of Cloudera Enterprise’s entire data engineering pipeline from data ingestion to data processing, with Apache Spark serving as the core processing framework.  Students will work with the most popular open source standards, including Spark, Apache Hive, Impala, Apache Sqoop, and Apache Flume, as well as advanced topics including Spark Streaming, Apache Kafka, Apache Solr, and many others. Upon completion of this learning path, students will have the skills necessary to attempt Cloudera’s performance-based CCP: Data Engineer certification.

Developer Training for Spark

•     Our core Spark class focuses solely on Spark and its role as a data processing framework. This course is designed for individuals and companies that are already familiar with Cloudera Enterprise and are interested in migrating to Spark.

Data Science at Scale with Spark and Hadoop

•     This course is designed for data scientists interested in applying their analytic skills against massive data sets. Here the processing is abstracted away and the emphasis is on the application. Advanced topics include MLlib (machine learning libraries included in Spark), and building recommenders using Spark and MLlib. Upon completion of this learning path, students will have the foundation necessary to attempt Cloudera’s performance-based CCP: Data Scientist certification.

The Cloudera Academic Partnership (CAP)

•     The Cloudera Academic Partnership program was founded in 2012 in order to introduce university-level students to the Hadoop ecosystem and prepare them for careers in big data. It gives computer science departments in accredited, nonprofit universities around the world access to free curricula and tools that otherwise would be expensive and time-consuming to develop or acquire on their own. CAP now includes Spark curriculum for universities affiliated with the program. Upon completion of CAP, students have the skills necessary to attempt Cloudera’s performance-based CCA: Spark and Hadoop Developer Certification, demonstrating the hands-on skills required for entry-level positions — a proven starting point for a career in big data.

Cloudera’s certification merits, including the 2015 salary survey for best big data certs and Best Big Data Certifications for 2015, confirm the company’s leadership in technology education.

As the first Hadoop distribution to ship and support Spark, Cloudera has unprecedented expertise and experience to create the most holistic Spark education program. Cloudera has the highest number of committers for Spark of any Hadoop distribution, the deepest level of platform integration, and the most customers running Spark - with over 150 across a wide range of industries and use cases. With considerable insight into the challenges of running Spark in production environments at scale, and deep knowledge of how engineering and analytics teams want to use the framework, Cloudera is uniquely positioned to deliver a comprehensive Spark education program.

With the One Platform Initiative, Cloudera is working with the community to fully unite Spark and Hadoop to enable the next generation of analytics. As Spark continues to grow in popularity and becomes better suited to fully replace MapReduce, it is critical that companies have the skills to take full advantage of Spark and turn their data into actionable insights. As the leader in Spark as part of Hadoop and the driving force for advancing Spark within the enterprise, Cloudera offers Spark training that lets companies do exactly that.

About Cloudera

Cloudera is revolutionizing enterprise data management by offering the first unified Platform for big data, an enterprise data hub built on Apache Hadoop. Cloudera offers enterprises one place to store, access, process, secure, and analyze all their data, empowering them to extend the value of existing investments while enabling fundamental new ways to derive value from their data. Cloudera's open source big data platform is the most widely adopted in the world, and Cloudera is the most prolific contributor to the open source Hadoop ecosystem. As the leading educator of Hadoop professionals, Cloudera has trained over 40,000 individuals worldwide. Over 1,700 partners and a seasoned professional services team help deliver greater time to value. Leading organizations in every industry plus top public sector organizations globally run Cloudera in production.

Connect With Cloudera

Learn more about Cloudera:
Read our blog:
Follow us on Twitter:
Get updates on LinkedIn:
Visit us on Facebook:
See us on YouTube:
Join the Cloudera Community:
Read about our customers' successes:

Cloudera, Cloudera's Platform for Big Data, Cloudera Enterprise Data Hub Edition, Cloudera Enterprise Flex Edition, Cloudera Enterprise Basic Edition and CDH are trademarks or registered trademarks of Cloudera Inc. in the United States, and in jurisdictions throughout the world. All other company and product names may be trade names or trademarks of their respective owners.

Press Inquiries

Deborah Wiltshire
