Designating a Replication Source

Minimum Required Role: Cluster Administrator (also provided by Full Administrator)

The Cloudera Manager Server that you are logged into is the destination for replications set up using that Cloudera Manager instance. From the Admin Console of this destination Cloudera Manager instance, you can designate a peer Cloudera Manager Server as a source of HDFS and Apache Hive data for replication.

Configuring a Peer Relationship

  1. Go to the Peers page by selecting Backup > Peers. If no peers exist, the page shows only an Add Peer button and a short message. If peers already exist, they appear in the Peers list.
  2. Click the Add Peer button.
  3. In the Add Peer dialog box, provide a name for the peer, the URL (including the port) of the source Cloudera Manager Server that manages the data to be replicated, and the login credentials for that server. Cloudera recommends using TLS/SSL; a warning is shown if the URL scheme is http instead of https. After configuring both peers to use TLS/SSL, add the remote source Cloudera Manager TLS/SSL certificate to the local Cloudera Manager truststore, and add the local Cloudera Manager TLS/SSL certificate to the remote truststore. See Configuring TLS (Encryption Only) for Cloudera Manager.
  4. Click the Add Peer button in the dialog box to create the peer relationship.

    The peer is added to the Peers list. Cloudera Manager automatically tests the connection between the Cloudera Manager Server and the peer. You can also click Test Connectivity to test the connection.
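
The fields collected by the Add Peer dialog box can also be assembled in a script. The following Python sketch only builds and validates the values; it does not contact any server. It assumes the field names name, url, username, and password used for peers by the Cloudera Manager REST API, and the host name and credentials in the usage note are hypothetical. Verify the field names and the endpoint against the API documentation for your Cloudera Manager version before sending anything.

```python
from urllib.parse import urlparse


def scheme_warning(peer_url):
    """Return a warning string if the peer URL does not use https, else None.

    This mirrors the warning the Add Peer dialog box shows for http URLs.
    """
    scheme = urlparse(peer_url).scheme.lower()
    if scheme != "https":
        return f"Peer URL uses '{scheme}'; TLS/SSL (https) is recommended."
    return None


def build_add_peer_request(name, peer_url, username, password):
    """Assemble the values entered in the Add Peer dialog box.

    The dictionary keys are an assumption based on the peer object of the
    Cloudera Manager REST API; confirm them for your version.
    """
    if urlparse(peer_url).port is None:
        # The procedure requires the URL to include the port.
        raise ValueError("Peer URL must include the port")
    return {
        "name": name,
        "url": peer_url,
        "username": username,
        "password": password,
    }
```

For example, build_add_peer_request("source-cm", "https://cm-source.example.com:7183", "admin", "secret") returns a dictionary that could be sent as the JSON body of a peer-creation request, while scheme_warning("http://cm-source.example.com:7180") returns a warning string.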

Modifying Peers

  1. Go to the Peers page by selecting Backup > Peers. If no peers exist, the page shows only an Add Peer button and a short message. If peers already exist, they appear in the Peers list.
  2. Do one of the following:
    • Edit
      1. In the row for the peer, select Edit.
      2. Make your changes.
      3. Click Update Peer to save your changes.
    • Delete - In the row for the peer, click Delete.
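
The Edit and Delete actions can be sketched as request descriptions for scripted use. This Python sketch assumes that the Cloudera Manager REST API exposes a peer at the path /cm/peers/{peerName} (relative to the API base path) and accepts PUT to update it and DELETE to remove it; the helper name peer_request is hypothetical, and the sketch builds the request without sending it.

```python
from urllib.parse import quote


def peer_request(action, peer_name, changes=None):
    """Return (HTTP method, path, body) for an Edit or Delete action.

    The path layout is an assumption; check the API documentation for
    your Cloudera Manager version.
    """
    # Percent-encode the peer name so spaces or slashes stay inside one path segment.
    path = "/cm/peers/" + quote(peer_name, safe="")
    if action == "edit":
        # Corresponds to Edit > Make your changes > Update Peer.
        return ("PUT", path, dict(changes or {}))
    if action == "delete":
        # Corresponds to clicking Delete in the row for the peer.
        return ("DELETE", path, None)
    raise ValueError(f"Unsupported action: {action}")
```

A design note: returning a plain tuple keeps the sketch independent of any HTTP client, so the same description can be logged, reviewed, or passed to whichever client your environment uses.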

Configuring an External Account for Amazon S3 Replication

To configure Amazon S3 as a source or destination for HDFS replication, configure an External Account that specifies the type of authentication to use. Amazon offers two types of authentication you can use with Amazon S3:
  • IAM Authentication. This type of authentication presumes that clients can access all of the data.
  • Access Key / Secret Key. This type of authentication is better suited to environments with multiple users or multi-tenancy, because you can configure selective access to different data paths.

For more information, see the Amazon S3 documentation.

To configure an External Account for Amazon S3:
  1. Go to Administration > External Accounts.
  2. Click the Add button and select one of the following:
    • Access Key Authentication

      This authentication mechanism requires you to obtain AWS credentials.

      1. Enter a Name of your choosing for this account.
      2. Enter the AWS Access Key ID.
      3. Enter the AWS Secret Key.
    • IAM Role-Based Authentication

      Select this option to use Amazon IAM role-based authentication.
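
The two choices above can be sketched as a single helper that produces an account definition. This Python sketch is illustrative only: the type names AWS_ACCESS_KEY_AUTH and AWS_IAM_ROLES_AUTH and the configuration keys aws_access_key and aws_secret_key are assumptions about the external account object of the Cloudera Manager REST API, and build_s3_account is a hypothetical helper. Confirm these identifiers for your Cloudera Manager version before use.

```python
def build_s3_account(name, access_key_id=None, secret_key=None):
    """Build an External Account definition for Amazon S3.

    With no keys, the account uses IAM role-based authentication.
    With both keys, it uses access key authentication.
    All identifier strings below are assumptions; verify before use.
    """
    if access_key_id is None and secret_key is None:
        return {
            "name": name,
            "typeName": "AWS_IAM_ROLES_AUTH",
            "accountConfigs": {"items": []},
        }
    if not access_key_id or not secret_key:
        # Access key authentication needs both halves of the credential.
        raise ValueError("Provide both the AWS Access Key ID and the AWS Secret Key")
    return {
        "name": name,
        "typeName": "AWS_ACCESS_KEY_AUTH",
        "accountConfigs": {
            "items": [
                {"name": "aws_access_key", "value": access_key_id},
                {"name": "aws_secret_key", "value": secret_key},
            ]
        },
    }
```

In practice, read the secret key from a protected source such as an environment variable or a secrets manager rather than hard-coding it in a script.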

Configuring Peers with SAML Authentication

If your cluster uses SAML Authentication, do the following before creating a peer:
  1. Create a Cloudera Manager user account that has the User Administrator or Full Administrator role.

    Alternatively, use an existing user that has one of these roles. Because this user is needed only to create the peer relationship, you can delete the user account after adding the peer.

  2. Create or modify the peer, as described in this topic.
  3. (Optional) Delete the Cloudera Manager user account you just created.