Cloudera Data Science Workbench Release Notes

These release notes provide information on new features, fixed issues, and incompatible changes for all generally available (GA) versions of Cloudera Data Science Workbench. For the current known issues and limitations, see Known Issues and Limitations in Cloudera Data Science Workbench 1.1.x.

Cloudera Data Science Workbench 1.1.1

This section lists the new features, fixed issues, and incompatible changes included in Cloudera Data Science Workbench 1.1.1.

New Features in Cloudera Data Science Workbench 1.1.1

  • Keytab Authentication - Starting with version 1.1.1, you can authenticate to the CDH cluster by uploading your Kerberos keytab to Cloudera Data Science Workbench. To use this feature, go to the top-right dropdown menu, click Account settings > Hadoop Authentication, enter your Kerberos principal, and click Upload Keytab.

Issues Fixed in Cloudera Data Science Workbench 1.1.1

  • Fixed an issue with airgapped installations where the installer could not pull the alpine 3.4 image into the airgapped environment.
  • Fixed an issue where Cloudera Data Science Workbench would fail to log a command trace when the Kerberos process exited.
  • Fixed authentication issues with older versions of MIT KDC.

Known Issues and Limitations in Cloudera Data Science Workbench 1.1.1

For a list of the current known issues and limitations in Cloudera Data Science Workbench 1.1.x, see Known Issues and Limitations in Cloudera Data Science Workbench 1.1.x.

Cloudera Data Science Workbench 1.1.0

This section lists the new features, fixed issues, and incompatible changes in Cloudera Data Science Workbench 1.1.0.

New Features and Changes in Cloudera Data Science Workbench 1.1.0

  • Added support for RHEL/CentOS 7.3 and Oracle Linux 7.3.

  • Cloudera Data Science Workbench now allows you to run GPU-based workloads. For more details, see Using GPUs for Cloudera Data Science Workbench Workloads.

  • For Cloudera Manager and CDH clusters that are not connected to the Internet, Cloudera Data Science Workbench now supports fully offline installations. See the installation guide for more details.

  • Web UIs for processing frameworks such as Spark 2, TensorFlow, and Shiny are now embedded in Cloudera Data Science Workbench and can be accessed directly from active sessions and jobs. For more details, see Accessing Web User Interfaces from Cloudera Data Science Workbench.

  • Added support for a Jobs REST API that lets you orchestrate jobs from third-party workflow tools. See Cloudera Data Science Workbench Jobs API.

  • DataFrames are now scrollable in the workbench session output pane. For examples, see the section on Grid Displays.

  • Cloudera Data Science Workbench now ships a new version of the Python engine image that includes newer versions of Pandas, seaborn, and assorted bug fixes.

  • Added support for rich visualizations in the Scala engine using Jupyter jvm-repr. For an example, see HTML Visualizations - Scala.

  • JAVA_HOME is now set in cdsw.conf, and not from the Site Administrator dashboard (Admin > Engines).

Issues Fixed in Cloudera Data Science Workbench 1.1.0

  • Improved support for dynamic data visualizations in Python, including Bokeh.

  • Fixed issues with the Python template project. The project now supports offline mode and will therefore work on airgapped clusters.

  • Fixed issues related to cached responses in Internet Explorer 11.

  • Fixed issues with Java symlinks outside of JAVA_HOME.

  • The cdsw status command can now be run on worker nodes.

  • Removed unauthenticated localhost access to Kubernetes.

  • Fixed Kerberos authentication issues with specific enc-types and Active Directory.

  • Removed restrictions on usernames with special characters for better compatibility with external authentication systems such as Active Directory.

  • Fixed issues with LDAP configuration validation that caused application crashes.

  • Improved LDAP test configuration form to avoid confusion on parameters being sent.

Incompatible Changes in Cloudera Data Science Workbench 1.1.0

  • Upgrading from version 1.0.x to 1.1.x

    During the upgrade process, you will encounter incompatibilities between the two versions of cdsw.conf. This is because even though you are installing the latest RPM, your previous configuration settings in cdsw.conf will remain unchanged. Depending on the release you are upgrading from, you will need to modify cdsw.conf to ensure it passes the validation checks run by the 1.1.x release.

    Key changes to note:
    • JAVA_HOME is now a required parameter. Make sure you add JAVA_HOME to cdsw.conf before you start Cloudera Data Science Workbench.
    • Previous versions allowed MASTER_IP to be set to a DNS hostname. If you are still using a DNS hostname, switch to an IP address.
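
As a rough illustration of the two key changes above, the following sketch checks a cdsw.conf-style file for a non-empty JAVA_HOME and an IP-address value for MASTER_IP before you start the upgraded release. It assumes cdsw.conf uses simple KEY="value" lines; the helper names (parse_conf, check_upgrade_settings) and the sample values are hypothetical, and this is not the validation that Cloudera Data Science Workbench itself runs.

```python
import ipaddress
import re


def parse_conf(text):
    """Parse KEY=value lines (optionally quoted); skip blanks and comments.

    Assumption: cdsw.conf is a flat list of shell-style assignments.
    """
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = re.match(r"^([A-Za-z_][A-Za-z0-9_]*)=(.*)$", line)
        if match:
            key, value = match.groups()
            settings[key] = value.strip().strip("\"'")
    return settings


def check_upgrade_settings(settings):
    """Return a list of problems related to the 1.1.x changes noted above."""
    problems = []
    if not settings.get("JAVA_HOME"):
        problems.append("JAVA_HOME is missing or empty")
    master_ip = settings.get("MASTER_IP", "")
    try:
        ipaddress.ip_address(master_ip)
    except ValueError:
        problems.append("MASTER_IP is not an IP address: %r" % master_ip)
    return problems


# Hypothetical 1.0.x-style settings: DNS hostname, no JAVA_HOME.
old_conf = '''
DOMAIN="cdsw.example.com"
MASTER_IP="cdsw-master.example.com"
'''

# Hypothetical settings adjusted for 1.1.x.
new_conf = '''
DOMAIN="cdsw.example.com"
MASTER_IP="10.0.0.5"
JAVA_HOME="/usr/java/default"
'''

old_problems = check_upgrade_settings(parse_conf(old_conf))
new_problems = check_upgrade_settings(parse_conf(new_conf))
print(old_problems)
print(new_problems)
```

Running a check like this before starting the service only surfaces these two settings; the release's own validation may cover more.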
  • Python engine updated in version 1.1.x

    Version 1.1.x includes an updated base engine image for Python which no longer uses the deprecated pylab mode in Jupyter to import the numpy and matplotlib functions into the global scope. With version 1.1.x, engines will now use built-in functions like any rather than the pylab counterpart, numpy.any. As a result of this change, you might see certain behavioral changes and differences in results between the two versions.
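
To make the kind of behavioral difference described above concrete, here is a minimal sketch comparing the built-in any with numpy.any on the same nested list. The sample data is made up for illustration, and the numpy import is guarded only so the sketch still runs where numpy is not installed.

```python
import builtins

nested = [[0, 0], [0, 0]]

# Built-in any() only tests the truthiness of each top-level item.
# A non-empty list is truthy, so the result is True.
builtin_result = builtins.any(nested)

try:
    import numpy as np

    # numpy.any() inspects the individual elements, which are all zero.
    numpy_result = bool(np.any(nested))
except ImportError:
    numpy_result = None  # numpy not available; comparison skipped

print("built-in any:", builtin_result)
print("numpy.any:   ", numpy_result)
```

Code that relied on pylab placing numpy.any in the global scope can call numpy.any explicitly to keep element-wise behavior.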

    Also note that Python projects originally created with engine 1 will be running pandas version 0.19, and will not auto-upgrade to version 0.20 by simply selecting engine 2. You will also need to manually install version 0.20.1 of pandas when you launch a project session.
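
One way to notice a stale pandas version at the start of a session is a small check such as the sketch below. The helper names are hypothetical, the version strings are passed in by hand here (in a session you might pass pandas.__version__), and the pip command it suggests is an assumption about how you install packages in your engine.

```python
import re

REQUIRED = "0.20.1"  # version named in the note above


def version_tuple(version):
    """Turn '0.20.1' into (0, 20, 1), keeping at most three numeric parts."""
    return tuple(int(n) for n in re.findall(r"\d+", version)[:3])


def needs_manual_install(installed, required=REQUIRED):
    """True if the installed pandas is older than the required version."""
    return version_tuple(installed) < version_tuple(required)


def install_hint(installed, required=REQUIRED):
    """Return a suggested install command, or None if nothing is needed."""
    if needs_manual_install(installed, required):
        return "pip install pandas==%s" % required
    return None


print(install_hint("0.19.2"))
print(install_hint("0.20.1"))
```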

Known Issues and Limitations in Cloudera Data Science Workbench 1.1.0

For a list of the current known issues and limitations in Cloudera Data Science Workbench 1.1.x, see Known Issues and Limitations in Cloudera Data Science Workbench 1.1.x.

Cloudera Data Science Workbench 1.0.1

This section lists the release notes for Cloudera Data Science Workbench 1.0.1. The documentation for version 1.0.x can be found at Cloudera Data Science Workbench 1.0.x.

Issues Fixed in Cloudera Data Science Workbench 1.0.1

  • Fixed a random port conflict that could prevent Scala engines from running.

  • Improved validation formatting and the visibility of some errors.

  • Fixed an issue with Firefox that was resulting in duplicate jobs on job creation.

  • Removed the external dependency on a CDN for MathJax.

  • Improved PATH and JAVA_HOME handling that previously broke Hadoop CLIs.

  • Fixed an issue with Java security policy files that caused Kerberos issues.

  • Fixed an issue that caused git clone to fail on some repositories.

  • Fixed an issue where updating LDAP admin settings deactivated the local fallback login.

  • Fixed an issue where bad LDAP configuration crashed the application.

  • Fixed an issue where job environment variable settings did not persist.

Known Issues and Limitations in Cloudera Data Science Workbench 1.0.x

For a list of known issues and limitations, refer to the documentation for version 1.0.x at Cloudera Data Science Workbench 1.0.x.

Cloudera Data Science Workbench 1.0.0

Version 1.0 represents the first generally available (GA) release of Cloudera Data Science Workbench. For information about the main features and benefits of Cloudera Data Science Workbench, as well as an architectural overview of the product, see About Cloudera Data Science Workbench.