Get hands-on experience

Through narrated demonstrations and hands-on exercises, learners gain familiarity with Cloudera Data Science Workbench (CDSW) and develop the skills required to:

  • Navigate CDSW’s options and interfaces with confidence
  • Create projects in CDSW and collaborate securely with other users and teams
  • Develop and run reproducible Python and R code
  • Customize projects by installing packages and setting environment variables
  • Connect to a secure (Kerberized) Cloudera cluster
  • Work with large-scale data using Apache Spark 2 with PySpark and sparklyr
  • Perform full exploratory data science and machine learning workflows in CDSW using Python or R: read, inspect, transform, visualize, and model data
  • Work collaboratively using CDSW together with Git

What To Expect

This course is designed for learners at organizations using CDSW under a Cloudera Enterprise license or a trial license. The learner must have access to a CDSW environment on a Cloudera cluster running Apache Spark 2. Some experience with data science using Python or R is helpful but not required. No prior knowledge of Spark or other Hadoop ecosystem tools is required.

Course Outline

Overview of CDSW 

  • Introduction to CDSW
  • How to Access CDSW
  • Navigating around CDSW
  • User Settings
  • Hadoop Authentication

Projects in CDSW 

  • Creating a New Project
  • Navigating around a Project
  • Project Settings

The CDSW Workbench Interface 

  • Using the Workbench
  • Using the Sidebar
  • Using the Code Editor
  • Engines and Sessions

Running Python and R Code in CDSW

  • Running Code
  • Using the Session Prompt
  • Using the Terminal
  • Installing Packages
  • Using Markdown in Comments

Using Apache Spark 2 in CDSW

  • Scenario and Dataset
  • Copying Files to HDFS
  • Interfaces to Apache Spark 2
  • Connecting to Spark
  • Reading Data
  • Inspecting Data

Exploratory Data Science in CDSW

  • Transforming Data
  • Using SQL Queries
  • Visualizing Data from Spark
  • Machine Learning with MLlib
  • Session History

Teams and Collaboration in CDSW

  • Collaboration in CDSW
  • Teams in CDSW
  • Using Git for Collaboration
  • Conclusion

Cloudera has not only prepared us for success today, but has also trained us to face and prevail over our big data challenges in the future.

Persado
