Cloudera Data Science Workbench User Guide

As a Cloudera Data Science Workbench user, you can create and run data science workloads, either individually or in teams. Cloudera Data Science Workbench uses the notion of contexts to separate your personal account from any team accounts you belong to. Depending on the context you are in, you can modify settings for either your personal account or a team account, and see the projects created in that account. Shared personal projects show up in your personal account context. Context changes in the UI are subtle, so if you cannot find a project or setting, first make sure you are in the right context.

The application header will tell you which context you are currently in. You can switch to a different context by going to the drop-down menu in the upper right-hand corner of the page.
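
The relationship between contexts and the projects you see can be pictured with a small conceptual model. The sketch below is illustrative only and is not how Cloudera Data Science Workbench is implemented: the Project class, the visible_projects function, and the account names are all hypothetical, and it assumes that "shared personal projects" means projects another user has shared with you.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Project:
    """Hypothetical stand-in for a project; not a product data structure."""
    name: str
    owner: str                            # personal or team account that owns the project
    shared_with: frozenset = frozenset()  # users the owner has shared the project with


def visible_projects(projects, context, user):
    """Return the names of projects listed for `user` in the given context.

    Assumption: a context lists the projects owned by that account, and the
    personal context additionally lists projects shared with the user.
    """
    names = []
    for project in projects:
        owned_here = project.owner == context
        shared_personal = context == user and user in project.shared_with
        if owned_here or shared_personal:
            names.append(project.name)
    return names


projects = [
    Project("churn-model", owner="alice"),
    Project("fraud-etl", owner="risk-team"),
    Project("bob-notebook", owner="bob", shared_with=frozenset({"alice"})),
]

# Personal context: alice's own project plus the one shared with her.
print(visible_projects(projects, context="alice", user="alice"))
# Team context: only the team's projects.
print(visible_projects(projects, context="risk-team", user="alice"))
```

In this model, a project that seems to be missing is usually just owned by a different account, which is why checking the current context first is a useful habit.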



The rest of this topic provides instructions for common tasks that a Cloudera Data Science Workbench user performs.

Managing Your Personal Account

To manage your personal account settings:

  1. Sign in to Cloudera Data Science Workbench.
  2. From the upper right drop-down menu, switch context to your personal account.
  3. Click Settings.
    Profile
    You can modify your name, email, and bio on this page.
    Teams
    This page lists the teams you are a part of and the role assigned to you for each team.
    SSH Keys
    Your public SSH key resides here. SSH keys provide a useful way to access external resources such as databases or remote Git repositories. For instructions, see SSH Keys.
    Hadoop Authentication
    Enter your Hadoop credentials here to authenticate yourself against the cluster KDC. For more information, see Hadoop Authentication with Kerberos for Cloudera Data Science Workbench.

Managing Team Accounts

Users who work together on more than one project and want to facilitate collaboration can create a Team. Teams allow streamlined administration of projects. Team projects are owned by the team, rather than an individual user. Team administrators can add or remove members at any time, assigning each member different permissions.

Creating a Team

To create a team:

  1. Click the plus sign (+) in the title bar, to the right of the Search field.
  2. Select Create Team.
  3. Enter a Team Name.
  4. Click Create Team.
  5. Add or invite team members. Team members can have one of the following privilege levels:
    • Viewer - Cannot create new projects within the team, but can be added to existing team projects.
    • Contributor - Can create new projects within the team, and can also be added to existing team projects.
    • Admin - Has complete access to all team projects, and to account and billing information.
  6. Click Done.
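
The three privilege levels above can be summarized as a small permission model. This is a minimal sketch for illustration, not a Cloudera Data Science Workbench API: the TeamRole enum and the helper functions are hypothetical names, and treating Admin as able to create projects is an assumption drawn from "complete access to all team projects".

```python
from enum import Enum


class TeamRole(Enum):
    """Hypothetical names for the three team privilege levels."""
    VIEWER = "viewer"
    CONTRIBUTOR = "contributor"
    ADMIN = "admin"


def can_create_project(role):
    """Contributors can create team projects; Viewers cannot.

    Assumption: Admins can as well, since they have complete access.
    """
    return role in (TeamRole.CONTRIBUTOR, TeamRole.ADMIN)


def needs_to_be_added_to_project(role):
    """Viewers and Contributors are added to existing projects individually;
    Admins already have access to every team project."""
    return role is not TeamRole.ADMIN


def can_manage_account(role):
    """Only Admins can access account and billing information."""
    return role is TeamRole.ADMIN


for role in TeamRole:
    print(
        f"{role.value}: create={can_create_project(role)}, "
        f"added per project={needs_to_be_added_to_project(role)}, "
        f"manage account={can_manage_account(role)}"
    )
```

Reading the roles this way shows that each level is a superset of the one before it, which can help when deciding the lowest privilege a new member needs.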

Modifying Team Account Settings

Team administrators can modify account information, add or invite new team members, and view or edit the privileges of existing members. To make these changes:
  1. From the upper right drop-down menu, switch context to the team account.
  2. Click Settings to open up the Account Settings dashboard.
    Profile
    Modify the team description on this page.
    Members
    You can add new team members on this page, and modify privilege levels for existing members.
    SSH Keys
    The team's public SSH key resides here. Team SSH keys provide a useful way to give an entire team access to external resources such as databases. For instructions, see SSH Keys. Generally, team SSH keys should not be used to authenticate against Git repositories. Use your personal key instead.

Next Steps

Once you have set up your personal or team account, the next steps are:
  1. Create a project in Cloudera Data Science Workbench. You can either create a new blank project or import an existing project. For instructions, see Managing Projects in Cloudera Data Science Workbench.
  2. Open the workbench and launch an engine to run your project. For help, see Using the Workbench Console in Cloudera Data Science Workbench.
  3. (Optional) For more mature data science projects that need to run on a recurring schedule, create jobs and pipelines. For more details on what you can accomplish by scheduling jobs and pipelines, see Managing Jobs and Pipelines in Cloudera Data Science Workbench.