Enterprise data science teams need access to the business data, tools, and computing resources required to develop and deploy machine learning workflows. Cloudera Machine Learning, part of the Cloudera Data Platform (CDP), gives data science teams these resources while enabling administrators to manage costs. Cloudera Machine Learning Training, offered through OnDemand, introduces Cloudera Machine Learning and prepares students to use it for data science and machine learning workflows in either Python or R.

What You Will Learn

Through narrated lectures and recorded demonstrations, you will learn how to:

  • Create and customize projects in Cloudera Machine Learning (CML)
  • Edit and run Python or R code in CML projects
  • Load and transform data in CML using popular Python or R packages and using Apache Spark with PySpark or sparklyr
  • Implement a machine learning workflow in CML by analyzing and visualizing data, training and testing a model, measuring and tracking the model, and deploying the model to generate predictions
  • Use CML together with the Git version control system

What to Expect

This course is intended for Python and R users at organizations running CML under a trial license or commercial license.

Course Outline

Introduction to CML

Using CML for the First Time

  • How to Access CML
  • Navigating CML
  • User Settings in CML
  • Kerberos Authentication in CML

Projects in CML

  • Creating a New Project in CML
  • Project Pages
  • Project Files
  • Project Settings

The CML Workbench Interface

  • The CML Workbench Interfaces
  • Using the Sidebar
  • Using the Code Editor
  • Keyboard Shortcuts in the Code Editor

Engines and Sessions in CML

  • Engines and Sessions
  • Session Options
  • Starting and Stopping Sessions
  • Using the Terminal
  • Using the Session Prompt

Running Python Code in CML (Python Track)

  • Running Python Code
  • Using Jupyter Magic Commands
  • Installing Python Packages
  • Viewing Python Documentation
  • Using Markdown in Python Code
  • Dataset and Example
  • Computing Inside a CML Session
  • Remote/Distributed Computing

Running R Code in CML (R Track)

  • Running R Code
  • Interrupting R Code
  • Installing R Packages
  • Viewing R Documentation
  • Using Markdown in R Code
  • Dataset and Example
  • Accessing Data with R
  • Computing Inside a CML Session
  • Remote/Distributed Computing

Using Spark with Python in CML (Python Track)

  • Using PySpark
  • Starting a Spark Session with PySpark
  • Reading Data with PySpark
  • Inspecting Data with PySpark
  • Transforming Data with PySpark
  • Using SQL with PySpark

Using Spark with R in CML (R Track)

  • Using sparklyr
  • Installing sparklyr
  • Starting a Spark Session with sparklyr
  • Reading Data with sparklyr
  • Inspecting Data with sparklyr
  • Transforming Data with sparklyr
  • Using SQL with sparklyr

Machine Learning with Python in CML (Python Track)

  • Machine Learning with Python
  • Data Visualization with Python
  • Machine Learning with scikit-learn
  • Sharing Results Using CML

Machine Learning with R in CML (R Track)

  • Machine Learning with R
  • Data Visualization with R
  • Machine Learning with tidymodels
  • Sharing Results Using CML

Experiments and Models with Python in CML (Python Track)

  • Experiments and Models
  • Running Experiments with Python
  • Using Experiment Results with Python
  • Preparing to Deploy Models with Python
  • Deploying Models with Python
  • Calling Models Deployed in CML

Experiments and Models with R in CML (R Track)

  • Experiments and Models
  • Running Experiments with R
  • Using Experiment Results with R
  • Preparing to Deploy Models with R
  • Deploying Models with R
  • Calling Models Deployed in CML

Teams and Collaboration in CML

  • Collaborating on Projects
  • Forking a Project
  • Teams
  • Limitations of Teams and Collaboration

Using Git with CML

  • Using Git with CML
  • Configuring SSH Keys
  • Cloning a Git Repository
  • Contributing to a Git Repository

Cloudera Machine Learning brings the agility and economics of the cloud to self-service machine learning workflows, with the governed business data and tools that data science teams need, anywhere.

Advance your career

Big data developers fill some of the world's most in-demand and highly compensated technical roles. Check out some of the currently listed job opportunities that match this professional profile, many of which seek CCA qualifications.

Private training

We also provide private training at your site, at your pace, and tailored to your needs.
