Enabling TLS/SSL for Cloudera Data Science Workbench

Cloudera Data Science Workbench uses HTTP and WebSockets (WS) to support interactive connections to the Cloudera Data Science Workbench web application. However, these connections are not secure by default. This topic describes how you can use TLS/SSL to secure connections between your browser and the Cloudera Data Science Workbench web application.

Transport Layer Security (TLS) is an industry-standard set of cryptographic protocols for securing communications over a network. TLS evolved from Secure Sockets Layer (SSL), which remains part of the name for historical reasons. TLS/SSL provides privacy and data integrity between applications communicating over a network by encrypting the packets transmitted between endpoints.

You can use TLS/SSL to enforce secure, encrypted connections to the Cloudera Data Science Workbench web application over HTTPS and WSS (WebSockets over TLS). Specifically, Cloudera Data Science Workbench can be configured to use a TLS termination proxy to handle incoming connection requests. The termination proxy decrypts incoming connection requests and forwards them to the Cloudera Data Science Workbench web application.

A TLS termination proxy can be internal or external. An internal termination proxy is run by Cloudera Data Science Workbench's built-in load balancer, called the ingress controller, on the master node. The ingress controller is primarily responsible for routing traffic and load balancing between Cloudera Data Science Workbench's web service backends. Once configured as described in the following sections, it also terminates HTTPS traffic. External termination can be done at an external load balancer such as the AWS Elastic Load Balancer.

Certificate Requirements

  • The TLS certificate must list both the Cloudera Data Science Workbench DOMAIN (set in cdsw.conf) and a wildcard for all first-level subdomains. For example, if DOMAIN is set to cdsw.company.com, then the TLS certificate must include both cdsw.company.com and *.cdsw.company.com. To verify this, run the following command and ensure that both domains are listed under X509v3 Subject Alternative Name.
    openssl x509 -in <your_tls_cert>.crt -noout -text
  • Many browsers no longer accept SHA-1 certificate signatures. If your certificate is signed by an internal Certificate Authority, make sure it uses at least SHA-256. You can verify this in the output of the previous openssl command, under the Signature Algorithm field. For SHA-256, the value under Signature Algorithm will be sha256WithRSAEncryption.
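
The two checks above can be scripted rather than read off by eye. The following is a minimal sketch, not part of the product: the function name check_cert_text is hypothetical, and treating any signature algorithm containing "sha1" or "md5" as too weak is an assumption made for illustration. It reads the text output of the openssl command on standard input and takes the DOMAIN value as its only argument.

```shell
# Hypothetical helper: reads `openssl x509 -noout -text` output on stdin.
# $1 is the DOMAIN value from cdsw.conf. Returns 0 if both checks pass.
check_cert_text() {
  domain="$1"
  text="$(cat)"
  status=0

  # Put each comma-separated SAN entry on its own line and trim whitespace,
  # so that every name can be compared exactly.
  entries="$(printf '%s\n' "$text" | tr ',' '\n' \
    | sed 's/^[[:space:]]*//; s/[[:space:]]*$//')"

  # Both the bare domain and the first-level wildcard must be present.
  for name in "$domain" "*.$domain"; do
    if ! printf '%s\n' "$entries" | grep -F -x -q -- "DNS:$name"; then
      echo "missing SAN entry: $name"
      status=1
    fi
  done

  # Assumption: SHA-1 and MD5 based signatures are treated as too weak.
  if printf '%s\n' "$text" | grep -i -E -q 'Signature Algorithm:.*(sha1|md5)'; then
    echo "weak signature algorithm"
    status=1
  fi

  return "$status"
}
```

A typical invocation would be: openssl x509 -in <your_tls_cert>.crt -noout -text | check_cert_text cdsw.company.com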

Internal Termination

Internal TLS termination must be configured during the installation process and is governed by the following variables in cdsw.conf.
  • TLS_ENABLE - When set to true, this property enforces HTTPS and WSS connections. The server will now redirect any HTTP request to HTTPS and generate URLs with the appropriate protocol.
  • TLS_KEY - Set to the path of the TLS private key.
  • TLS_CERT - Set to the path of the TLS certificate.

    Certificates and keys must be in PEM format.
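
As an illustration, the relevant part of cdsw.conf for internal termination might look like the following. The file paths are hypothetical placeholders and the quoting style is an assumption; use the actual locations of your PEM files.

```shell
# Sketch of the TLS-related entries in cdsw.conf for internal termination.
DOMAIN="cdsw.company.com"
TLS_ENABLE="true"
# Hypothetical paths to the PEM-encoded certificate and private key.
TLS_CERT="/opt/cdsw/tls/cdsw.company.com.crt"
TLS_KEY="/opt/cdsw/tls/cdsw.company.com.key"
```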

External Termination

External TLS termination must be configured during the installation process and is governed by the TLS_ENABLE variable in cdsw.conf.
  • TLS_ENABLE - When set to true, this property enforces HTTPS and WSS connections. The server will now redirect any HTTP request to HTTPS and generate URLs with the appropriate protocol.

    The TLS_KEY and TLS_CERT properties must be left blank.
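
For comparison, a sketch of the same cdsw.conf entries for external termination, assuming the same quoting style. Only TLS_ENABLE is set; the certificate and key live on the external load balancer instead.

```shell
# Sketch of the TLS-related entries in cdsw.conf for external termination.
DOMAIN="cdsw.company.com"
TLS_ENABLE="true"
# Left blank on purpose: the external proxy holds the certificate and key.
TLS_CERT=""
TLS_KEY=""
```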

Many load balancers and proxies require a URL they can ping to validate the status of the web service backend. For instance, you can configure a load balancer to send an HTTP GET request to /internal/load-balancer/health-ping. If the response is 200 (OK), the backend is healthy. Note that, as with all communication from the load balancer to the web backend when TLS is terminated externally, this request should be sent over HTTP and not HTTPS.
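
To try the health check by hand before configuring the load balancer, something like the following sketch can be used. The function names and the backend address are hypothetical; only the /internal/load-balancer/health-ping path and the 200 response come from the description above.

```shell
# Hypothetical backend address; note plain HTTP, not HTTPS.
CDSW_BACKEND="http://cdsw.company.com"

# Succeeds only for the status code that signals a healthy backend.
is_healthy() {
  [ "$1" = "200" ]
}

# Sends an HTTP GET to the health-ping URL and evaluates the status code.
probe_backend() {
  code="$(curl -s -o /dev/null -w '%{http_code}' \
    "$1/internal/load-balancer/health-ping")"
  is_healthy "$code"
}
```

Running probe_backend "$CDSW_BACKEND" && echo healthy from a host that can reach the backend directly mimics what the load balancer does.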

Limitations

  • Communication within the Cloudera Data Science Workbench cluster is not encrypted.

  • Troubleshooting can be difficult because browsers do not typically display helpful security errors with WebSockets. Often they will just silently fail to connect.

  • In general, browsers do not support self-signed certificates for WSS. Your certificate must be signed by a Certificate Authority (CA) that your users’ browsers will trust. Cloudera Data Science Workbench will not function properly if browsers silently abort WebSocket connections.

    If you need to use a certificate signed by your organization's internal CA, make sure that all your users import your root CA certificate into their machine’s trust store. This can be done using the Keychain Access application on Macs or the Microsoft Management Console on Windows.

    If your browser asks if you want to trust the certificate provided by Cloudera Data Science Workbench, that means you are using a self-signed certificate, and WSS connections will likely be aborted silently, regardless of your response to the dialog.
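
Because WSS failures are silent, it can help to check ahead of time whether a certificate is self-signed. The sketch below uses a simple heuristic, namely that a self-signed certificate has identical subject and issuer fields. The function name looks_self_signed is hypothetical, and the heuristic does not verify the signature itself.

```shell
# Hypothetical helper: reads `openssl x509 -noout -subject -issuer` output
# on stdin and succeeds when the subject and issuer are identical.
looks_self_signed() {
  info="$(cat)"
  # Strip the "subject=" / "issuer=" prefixes and any spaces that follow.
  subject="$(printf '%s\n' "$info" | sed -n 's/^subject=[[:space:]]*//p')"
  issuer="$(printf '%s\n' "$info" | sed -n 's/^issuer=[[:space:]]*//p')"
  [ -n "$subject" ] && [ "$subject" = "$issuer" ]
}
```

For example: openssl x509 -in <your_tls_cert>.crt -noout -subject -issuer | looks_self_signed && echo "certificate appears to be self-signed"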