How to Configure AWS Credentials

Minimum Required Role: User Administrator (also provided by Full Administrator)

Amazon S3 (Simple Storage Service) can be used in a CDH cluster managed by Cloudera Manager in the following ways:
  • As storage for Impala tables
  • As a source or destination for HDFS replication and for cluster storage
  • To enable Cloudera Navigator to extract metadata from Amazon S3 storage
  • To browse S3 data using Hue
To provide access to Amazon S3, you configure AWS Credentials that specify the authentication type (role-based, for example) and, for access key authentication, the access and secret keys. Amazon offers two types of authentication you can use with Amazon S3:
IAM Role-based Authentication

With role-based authentication through AWS Identity and Access Management (IAM), all clients that use a given role get the same access to all data. All jobs on the cluster have the same level of access to Amazon S3, so this type is better suited to environments where there is a single user, or where all users of a cluster should have the same privileges to data in Amazon S3.

If you are setting up a peer to copy data to and from Amazon S3, using Cloudera Manager Hive or HDFS replication, select this option.

If you are configuring S3 access for a cluster deployed on AWS, and you have assigned the AWS host profile to the AWS cluster hosts, you do not need to configure IAM authentication on this screen for services such as Impala, Hive, or Spark.

Access Key Authentication

This type of authentication requires an AWS Access Key and an AWS Secret Key that you obtain from Amazon, and is better suited for environments where you have multiple users or multi-tenancy. When you use the S3 Connector service, you must enable the Sentry service and Kerberos, which allows you to configure selective access for different data paths. (The Sentry service is not required for BDR replication or access by Cloudera Navigator.)

Cloudera Manager stores these values securely and does not store them in world-readable locations. The credentials are masked in the Cloudera Manager Admin Console, encrypted in the configurations passed to processes managed by Cloudera Manager, and redacted from the logs.
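
The guidance on choosing between the two authentication types can be summarized as a small decision helper. This is only an illustrative sketch: the `AuthType` enum and the `suggest_auth_type` function are hypothetical names, not part of Cloudera Manager, and giving the replication case precedence over the multi-tenant case is an assumption made for the example.

```python
from enum import Enum


class AuthType(Enum):
    """The two authentication types offered on the AWS Credentials screen."""
    IAM_ROLE = "IAM Role-based Authentication"
    ACCESS_KEY = "Access Key Authentication"


def suggest_auth_type(multi_tenant: bool, replication_peer: bool = False) -> AuthType:
    """Suggest an authentication type (hypothetical helper, not a Cloudera API)."""
    # A peer for Hive or HDFS replication to or from S3 uses the IAM option.
    # Checking this first is an assumption of this sketch.
    if replication_peer:
        return AuthType.IAM_ROLE
    # Multiple users or tenants need selective access, so use access keys.
    if multi_tenant:
        return AuthType.ACCESS_KEY
    # A single user, or users who all share the same privileges.
    return AuthType.IAM_ROLE
```

For example, `suggest_auth_type(multi_tenant=True)` returns `AuthType.ACCESS_KEY`, while a single-user cluster gets `AuthType.IAM_ROLE`.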

For more information about Amazon S3, see the Amazon S3 documentation.

The client configuration files generated by Cloudera Manager based on configured services do not include AWS credentials. These clients must manage access to these credentials outside of Cloudera Manager. Cloudera Manager uses credentials stored in Cloudera Manager for trusted clients such as the Impala daemon and Hue. For access from YARN, MapReduce or Spark, see Using S3 Credentials with YARN, MapReduce, or Spark.
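
Because client configurations do not carry the credentials, a client has to supply them itself. One possible approach, sketched below, reads the keys from the standard AWS environment variables and turns them into Hadoop `-D` options for the S3A connector properties `fs.s3a.access.key` and `fs.s3a.secret.key`. The function name `s3a_credential_options` and the bucket name in the usage note are hypothetical. Options passed on a command line can be visible to other users on the host, so treat this as a sketch for testing rather than a recommended production setup.

```python
import os
from typing import List, Mapping


def s3a_credential_options(env: Mapping[str, str] = os.environ) -> List[str]:
    """Build Hadoop -D options carrying S3A credentials (hypothetical helper)."""
    access_key = env.get("AWS_ACCESS_KEY_ID")
    secret_key = env.get("AWS_SECRET_ACCESS_KEY")
    # Fail early instead of running a job with missing credentials.
    if not access_key or not secret_key:
        raise ValueError(
            "AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY must both be set"
        )
    return [
        f"-Dfs.s3a.access.key={access_key}",
        f"-Dfs.s3a.secret.key={secret_key}",
    ]
```

The returned options would go between the command and its arguments, for example `["hadoop", "fs"] + s3a_credential_options() + ["-ls", "s3a://example-bucket/"]`.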

Adding AWS Credentials

Minimum Required Role: User Administrator (also provided by Full Administrator)

To add AWS Credentials for Amazon S3:
  1. Open the Cloudera Manager Admin Console.
  2. Click Administration > AWS Credentials.
  3. Click Add and select one of the following:
    • Access Key Authentication

      This authentication mechanism requires you to obtain AWS credentials from Amazon.

      1. Enter a Name of your choosing for this account.
      2. Enter the AWS Access Key ID.
      3. Enter the AWS Secret Key.
    • IAM Role-Based Authentication

      If you are setting up a peer to copy data to and from Amazon S3, using Cloudera Manager Hive or HDFS replication, select this option.

      If you are configuring S3 access for a cluster deployed on AWS, and you have assigned the AWS host profile to the AWS cluster hosts, you do not need to configure IAM authentication on this screen for services such as Impala, Hive, or Spark.

  4. Click Save.

    The Connect to Amazon Web Services screen displays.

  5. Choose one of the following options:
    • To configure Amazon S3 as the source or destination of a replication schedule (to back up and restore data, for example), click the Replication Schedules link. See Data Replication for details.
    • To enable cluster access to S3 using the S3 Connector Service, click the Enable for Cluster Name link, which launches a wizard for adding the S3 Connector service. See Adding the S3 Connector Service for details.
    • To give Cloudera Navigator access to Amazon S3, click the Enable for Cloudera Navigator link. Restart the Cloudera Navigator Metadata server to enable access.
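
The Admin Console steps above might also be scripted against the Cloudera Manager REST API. The sketch below only builds a request body and sends nothing. The type names (`AWS_ACCESS_KEY_AUTH`, `AWS_IAM_ROLES_AUTH`), the configuration keys (`aws_access_key`, `aws_secret_key`), and the JSON field layout are assumptions about the external accounts resource and should be checked against the API reference for your Cloudera Manager release. The function name `build_account_payload` is hypothetical.

```python
import json
from typing import Optional

# Assumed identifiers; verify against the API reference for your release.
ACCESS_KEY_TYPE = "AWS_ACCESS_KEY_AUTH"
IAM_ROLE_TYPE = "AWS_IAM_ROLES_AUTH"


def build_account_payload(
    name: str,
    type_name: str,
    access_key: Optional[str] = None,
    secret_key: Optional[str] = None,
) -> dict:
    """Build a request body for an AWS Credentials account (hypothetical helper)."""
    if type_name == ACCESS_KEY_TYPE:
        # Access key authentication needs both keys, as in steps 2 and 3.
        if not access_key or not secret_key:
            raise ValueError("access_key and secret_key are both required")
        items = [
            {"name": "aws_access_key", "value": access_key},
            {"name": "aws_secret_key", "value": secret_key},
        ]
    elif type_name == IAM_ROLE_TYPE:
        # Role-based authentication carries no keys.
        items = []
    else:
        raise ValueError(f"unknown account type: {type_name}")
    return {
        "name": name,
        "displayName": name,
        "typeName": type_name,
        "accountConfigs": {"items": items},
    }


# Serialize a role-based account; no secrets are involved in this case.
body = json.dumps(build_account_payload("s3-replication", IAM_ROLE_TYPE))
```

Keeping the payload construction separate from any HTTP call makes it easy to review what would be sent before real keys are involved.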

Managing AWS Credentials

To remove AWS credentials:
  1. Open the Cloudera Manager Admin Console.
  2. Click Administration > AWS Credentials.
  3. Locate the row with the credentials you want to delete and click Actions > Remove.
To edit AWS credentials:
  1. Open the Cloudera Manager Admin Console.
  2. Click Administration > AWS Credentials.
  3. Locate the row with the credentials you want to edit and click Actions > Edit Account Details.

    The Edit Account screen displays.

  4. Edit the account fields.
  5. Click Save.
  6. Restart cluster services that use these credentials. If the credentials are connected to Cloudera Navigator, restart the Cloudera Navigator Metadata server.
To edit the services connected to an AWS Credentials account:
  1. Open the Cloudera Manager Admin Console.
  2. Click Administration > AWS Credentials.
  3. Locate the row with the credentials you want to edit and click Actions > Edit Connectivity.

    The Connect to Amazon Web Services screen displays.

  4. Click one of the following options:
    • To configure Amazon S3 as the source or destination of a replication schedule (to back up and restore data, for example), click the Replication Schedules link. See Data Replication for details.
    • To enable cluster access to S3 using the S3 Connector Service, click the Enable for Cluster Name link, which launches a wizard for adding the S3 Connector service. See Adding the S3 Connector Service for details.
    • To give Cloudera Navigator access to Amazon S3, click the Enable for Cloudera Navigator link. Restart the Cloudera Navigator Metadata server to enable access.