Configuring Cloudera Data Science Workbench Deployments Behind a Proxy
Depending on your deployment, use one of the following methods to configure the proxy in Cloudera Data Science Workbench:
- CSD - Set the HTTP Proxy or HTTPS Proxy properties in the CDSW service in Cloudera Manager.
- RPM - Set the HTTP_PROXY or HTTPS_PROXY properties in /etc/cdsw/config/cdsw.conf on all Cloudera Data Science Workbench gateway hosts.
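For an RPM deployment, the two properties might look like the following in /etc/cdsw/config/cdsw.conf. This is a minimal sketch: the proxy host proxy.example.com and port 3128 are placeholder assumptions, not values from this guide, and the quoted KEY="value" form is assumed to match the other entries in your cdsw.conf.

```shell
# /etc/cdsw/config/cdsw.conf (fragment)
# Placeholder proxy URL; replace the host and port with your own proxy.
HTTP_PROXY="http://proxy.example.com:3128"
HTTPS_PROXY="http://proxy.example.com:3128"
```

The same values need to be present on every Cloudera Data Science Workbench gateway host.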
Supporting a TLS-Enabled Proxy Server
If the proxy server uses TLS encryption to handle connection requests, you will need to add the proxy's root CA certificate to your host's store of trusted certificates. This is because proxy servers typically sign their server certificate with their own root certificate. Therefore, any connection attempts will fail until the Cloudera Data Science Workbench host trusts the proxy's root CA certificate. If you do not have access to your proxy's root certificate, contact your Network / IT administrator.
Copy the proxy's root certificate to the trusted CA certificate store (ca-trust) on the Cloudera Data Science Workbench host.
cp /tmp/<proxy-root-certificate>.crt /etc/pki/ca-trust/source/anchors/
Use the following command to rebuild the trusted certificate store.
update-ca-trust
If you will be using custom engine images that will be pulled from a Docker repository, add the proxy's root certificates to a directory under /etc/docker/certs.d. For example, if your Docker repository is at docker.repository.mycompany.com, create the following directory structure:
/etc/docker/certs.d
|-- docker.repository.mycompany.com    # Directory named after Docker repository
    |-- <proxy-root-certificate>.crt   # Docker-related root CA certificates
This step is not required with the standard engine images because they are included in the Cloudera Data Science Workbench RPM.
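The directory layout for a custom Docker repository can be created with a couple of commands. The following is a minimal sketch in POSIX shell; the helper name install_registry_ca and its argument order are hypothetical, and the certificate path in the example invocation is the same placeholder used earlier in this section.

```shell
# Hypothetical helper: copy a CA certificate into Docker's per-registry
# certificate directory.
#   $1 = certs.d root (normally /etc/docker/certs.d)
#   $2 = Docker repository hostname
#   $3 = path to the proxy's root certificate
install_registry_ca() {
  certs_root=$1
  registry=$2
  cert=$3
  # Create the directory named after the repository, then copy the certificate in.
  mkdir -p "$certs_root/$registry" &&
    cp "$cert" "$certs_root/$registry/"
}

# Example invocation (run as root on the host; the certificate name is a placeholder):
# install_registry_ca /etc/docker/certs.d docker.repository.mycompany.com "/tmp/<proxy-root-certificate>.crt"
```

Passing the certs.d root as an argument keeps the helper easy to try against a scratch directory before touching /etc/docker.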
Re-initialize Cloudera Data Science Workbench for this change to take effect.
Configure Hostnames to be Skipped from the Proxy
Starting with version 1.4, if you have defined a proxy in the HTTP_PROXY, HTTPS_PROXY, or ALL_PROXY properties, Cloudera Data Science Workbench automatically appends the following list of addresses to the NO_PROXY configuration. Note that this is the minimum required configuration for this field.
"127.0.0.1,localhost,100.66.0.1,100.66.0.2,100.66.0.3, 100.66.0.4,100.66.0.5,100.66.0.6,100.66.0.7,100.66.0.8,100.66.0.9, 100.66.0.10,100.66.0.11,100.66.0.12,100.66.0.13,100.66.0.14, 100.66.0.15,100.66.0.16,100.66.0.17,100.66.0.18,100.66.0.19, 100.66.0.20,100.66.0.21,100.66.0.22,100.66.0.23,100.66.0.24, 100.66.0.25,100.66.0.26,100.66.0.27,100.66.0.28,100.66.0.29, 100.66.0.30,100.66.0.31,100.66.0.32,100.66.0.33,100.66.0.34, 100.66.0.35,100.66.0.36,100.66.0.37,100.66.0.38,100.66.0.39, 100.66.0.40,100.66.0.41,100.66.0.42,100.66.0.43,100.66.0.44, 100.66.0.45,100.66.0.46,100.66.0.47,100.66.0.48,100.66.0.49, 100.66.0.50,100.77.0.10,100.77.0.128,100.77.0.129,100.77.0.130, 100.77.0.131,100.77.0.132,100.77.0.133,100.77.0.134,100.77.0.135, 100.77.0.136,100.77.0.137,100.77.0.138,100.77.0.139"
Beyond 127.0.0.1, localhost, and the internal addresses above, the NO_PROXY value should also cover any private Docker registries and HTTP services inside the firewall that Cloudera Data Science Workbench users might want to access from the engines.
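The automatically appended addresses follow a regular pattern (100.66.0.1 through 100.66.0.50, 100.77.0.10, and 100.77.0.128 through 100.77.0.139), so the default value can be regenerated for comparison with a running configuration. The following is a minimal sketch in POSIX shell; the function name default_no_proxy is hypothetical and is not part of Cloudera Data Science Workbench.

```shell
# Hypothetical helper: print the default NO_PROXY list described above
# as a single comma-separated line with no embedded spaces.
default_no_proxy() (
  list="127.0.0.1,localhost"
  # 100.66.0.1 through 100.66.0.50
  i=1
  while [ "$i" -le 50 ]; do
    list="$list,100.66.0.$i"
    i=$((i + 1))
  done
  # The single address 100.77.0.10
  list="$list,100.77.0.10"
  # 100.77.0.128 through 100.77.0.139
  i=128
  while [ "$i" -le 139 ]; do
    list="$list,100.77.0.$i"
    i=$((i + 1))
  done
  printf '%s\n' "$list"
)

default_no_proxy
```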
To configure any additional hostnames that should be skipped from the proxy, use one of the following methods depending on your deployment:
On a CSD deployment, use the Cloudera Manager CDSW service's No Proxy property to specify a comma-separated list of hostnames.
On an RPM deployment, configure the NO_PROXY field in cdsw.conf on all Cloudera Data Science Workbench hosts.
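Because both methods expect a comma-separated list, it can help to build the value from individual hostnames instead of typing it by hand. The following is a minimal sketch in POSIX shell; the helper name join_no_proxy and the hostname internal-api.example.com are hypothetical, and the printed line is only assumed to match the quoted KEY="value" form used in your cdsw.conf.

```shell
# Hypothetical helper: join its arguments into one comma-separated string.
# The function body runs in a subshell, so the IFS change does not leak out.
join_no_proxy() (
  IFS=','
  printf '%s\n' "$*"
)

# Placeholder hostnames for services that should bypass the proxy.
extra_hosts=$(join_no_proxy docker.repository.mycompany.com internal-api.example.com)

# Print a line suitable for pasting into cdsw.conf on each host.
printf 'NO_PROXY="%s"\n' "$extra_hosts"
```

On a CSD deployment, the same comma-separated string can be pasted into the No Proxy property instead.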