Configuring Encryption

Encryption is a process that uses digital keys to encode data (for example, text, files, databases, passwords, applications, or network packets) so that only an authorized entity, such as a user or system process, can decrypt it and then view, modify, or add to the data. For Cloudera CDH components, encryption can be applied at several layers of the cluster, as shown in the following table:

Layer               Description

Application         Applied by the HDFS client software, HDFS Transparent Encryption lets you encrypt specific folders contained in HDFS. To securely store the required encryption keys, Cloudera recommends using Cloudera Navigator Key Trustee Server in conjunction with HDFS encryption. See Enabling HDFS Encryption Using Cloudera Navigator Key Trustee Server for details.

                    Data stored temporarily on the local filesystem outside HDFS by CDH components (including Impala, MapReduce, YARN, or HBase) can also be encrypted. See Configuring Encryption for Data Spills for details.

Operating System    At the Linux OS filesystem layer, encryption can be applied to an entire volume. For example, Cloudera Navigator Encrypt can encrypt data inside and outside HDFS, such as temp/spill files, configuration files, and databases that store metadata associated with a CDH cluster. Navigator Encrypt requires a license for Cloudera Navigator and must be configured to use Navigator Key Trustee Server.

Network             Network communications between client processes and server processes (HTTP, RPC, or TCP/IP services) can be encrypted using industry-standard TLS/SSL as detailed in TLS/SSL Overview. See How to Configure TLS Encryption for Cloudera Manager for a step-by-step configuration guide.
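
As a rough illustration of what network-layer encryption means from a client's point of view, the following Python sketch builds a TLS context that verifies the server certificate before any data is exchanged. It uses only the Python standard library and is not taken from Cloudera's configuration procedure; the function name make_tls_client_context and the optional ca_file parameter are assumptions made for this example.

```python
import ssl


def make_tls_client_context(ca_file=None):
    """Build a client-side TLS context that verifies the server certificate.

    ca_file is a hypothetical path to a PEM file of trusted CA certificates;
    when it is None, the system's default trust store is used.
    """
    # create_default_context turns on certificate verification and
    # hostname checking for client connections.
    context = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_file)
    # Refuse protocol versions older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


context = make_tls_client_context()
```

A client would pass this context to a call such as context.wrap_socket(sock, server_hostname=host) so that the connection fails if the server's certificate cannot be validated.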

This section shows you how to configure the various types of encryption for a Cloudera cluster, starting with network encryption over TLS/SSL. Later sections show you how to configure data-at-rest encryption using two complementary mechanisms, HDFS Transparent Encryption and Cloudera Navigator Encrypt.
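
To make the HDFS Transparent Encryption case more concrete, the sketch below assembles the standard Hadoop command lines typically used to create an encryption key and an encryption zone. It only builds and prints the commands; it does not run them. The key name example_key and the path /data/secure are hypothetical, and on a secured cluster these steps are usually split between a key administrator and the HDFS superuser.

```python
import shlex


def encryption_zone_commands(key_name, zone_path):
    """Return the command lines for creating a key and an encryption zone.

    key_name and zone_path are placeholders supplied by the caller.
    """
    return [
        # Create an encryption key in the configured key management service.
        ["hadoop", "key", "create", key_name],
        # The zone directory must exist and be empty before it becomes a zone.
        ["hdfs", "dfs", "-mkdir", "-p", zone_path],
        # Turn the directory into an encryption zone protected by the key.
        ["hdfs", "crypto", "-createZone", "-keyName", key_name, "-path", zone_path],
    ]


# Hypothetical key name and path, shown only for illustration.
for argv in encryption_zone_commands("example_key", "/data/secure"):
    print(shlex.join(argv))
```

Files written under the zone path after these steps are encrypted and decrypted transparently by the HDFS client, which is why the table lists this mechanism at the application layer.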