HttpFS Authentication

This section describes how to configure HttpFS in CDH 5 with Kerberos security on a Hadoop cluster.

For more information about HttpFS, see https://archive.cloudera.com/cdh5/cdh/5/hadoop/hadoop-hdfs-httpfs/index.html.

Configuring the HttpFS Server to Support Kerberos Security

  1. Create an HttpFS service user principal that is used to authenticate with the Hadoop cluster. The syntax of the principal is httpfs/<fully.qualified.domain.name>@<YOUR-REALM>, where:
    • fully.qualified.domain.name is the host where the HttpFS server is running.
    • YOUR-REALM is the name of your Kerberos realm.
    kadmin: addprinc -randkey httpfs/fully.qualified.domain.name@YOUR-REALM.COM
  2. Create an HTTP service user principal that is used to authenticate user requests coming to the HttpFS HTTP web services. The syntax of the principal is HTTP/<fully.qualified.domain.name>@<YOUR-REALM>, where:
    • fully.qualified.domain.name is the host where the HttpFS server is running.
    • YOUR-REALM is the name of your Kerberos realm.
    kadmin: addprinc -randkey HTTP/fully.qualified.domain.name@YOUR-REALM.COM
  3. Create a keytab file for each of the two principals:
    $ kadmin
    kadmin: xst -k httpfs.keytab httpfs/fully.qualified.domain.name
    kadmin: xst -k http.keytab HTTP/fully.qualified.domain.name
  4. Merge the two keytab files into a single keytab file:
    $ ktutil
    ktutil: rkt httpfs.keytab
    ktutil: rkt http.keytab
    ktutil: wkt httpfs-http.keytab
  5. Check that the merged keytab file contains the credentials for both principals. For example:
    $ klist -e -k -t httpfs-http.keytab
  6. Copy the httpfs-http.keytab file to the HttpFS configuration directory. The owner of the httpfs-http.keytab file should be the httpfs user and the file should have owner-only read permissions.
  7. Edit the HttpFS server httpfs-site.xml configuration file in the HttpFS configuration directory by setting the following properties:

    Property                                         Value
    httpfs.authentication.type                       kerberos
    httpfs.hadoop.authentication.type                kerberos
    httpfs.authentication.kerberos.principal         HTTP/<HTTPFS-HOSTNAME>@<YOUR-REALM.COM>
    httpfs.authentication.kerberos.keytab            /etc/hadoop-httpfs/conf/httpfs-http.keytab
    httpfs.hadoop.authentication.kerberos.principal  httpfs/<HTTPFS-HOSTNAME>@<YOUR-REALM.COM>
    httpfs.hadoop.authentication.kerberos.keytab     /etc/hadoop-httpfs/conf/httpfs-http.keytab
    httpfs.authentication.kerberos.name.rules        Use the value configured for hadoop.security.auth_to_local in core-site.xml
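
Steps 6 and 7 above can be sketched as a pair of shell functions. This is a minimal sketch under stated assumptions, not the documented procedure: the function names install_keytab, property, and httpfs_kerberos_properties are hypothetical helpers introduced for illustration, and the host and realm are whatever placeholder values you pass in. The httpfs.authentication.kerberos.name.rules property is deliberately left out, because its value has to be copied from hadoop.security.auth_to_local in core-site.xml.

```shell
# Step 6 (sketch): copy the merged keytab into the configuration directory,
# hand it to the httpfs user, and make it readable by the owner only.
# $1 = source keytab, $2 = destination directory, $3 = owner (default: httpfs)
install_keytab() {
  dest="$2/$(basename "$1")"
  cp "$1" "$dest" &&
  chown "${3:-httpfs}" "$dest" &&
  chmod 400 "$dest"
}

# Print a single <property> element in the Hadoop site-file layout.
# $1 = property name, $2 = property value
property() {
  printf '  <property>\n    <name>%s</name>\n    <value>%s</value>\n  </property>\n' "$1" "$2"
}

# Step 7 (sketch): print the Kerberos properties from the table above.
# $1 = HttpFS host name, $2 = Kerberos realm, $3 = path to the merged keytab
httpfs_kerberos_properties() {
  host=$1
  realm=$2
  keytab=$3
  property httpfs.authentication.type kerberos
  property httpfs.hadoop.authentication.type kerberos
  property httpfs.authentication.kerberos.principal "HTTP/$host@$realm"
  property httpfs.authentication.kerberos.keytab "$keytab"
  property httpfs.hadoop.authentication.kerberos.principal "httpfs/$host@$realm"
  property httpfs.hadoop.authentication.kerberos.keytab "$keytab"
  # httpfs.authentication.kerberos.name.rules is not emitted here:
  # copy its value from hadoop.security.auth_to_local in core-site.xml.
}
```

As a usage example, running install_keytab httpfs-http.keytab /etc/hadoop-httpfs/conf as root would cover step 6, and the output of httpfs_kerberos_properties with your host name, your realm, and /etc/hadoop-httpfs/conf/httpfs-http.keytab could be pasted inside the <configuration> element of httpfs-site.xml.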

Using curl to Access a URL Protected by Kerberos HTTP SPNEGO

To use curl to access a URL protected by Kerberos HTTP SPNEGO:

  1. Run curl -V and confirm that the Features line includes GSS-Negotiate:
    $ curl -V
    curl 7.19.7 (universal-apple-darwin10.0) libcurl/7.19.7 OpenSSL/0.9.8l zlib/1.2.3
    Protocols: tftp ftp telnet dict ldap http file https ftps
    Features: GSS-Negotiate IPv6 Largefile NTLM SSL libz
  2. Log in to the KDC using kinit:
    $ kinit
    Please enter the password for tucu@LOCALHOST:
  3. Use curl to fetch the protected URL:
    $ curl --cacert /path/to/truststore.pem --negotiate -u : -b ~/cookiejar.txt -c ~/cookiejar.txt 'https://localhost:14000/webhdfs/v1/?op=liststatus'
    where:
    • The --cacert option is required if you are using TLS/SSL certificates that curl does not recognize by default.
    • The --negotiate option enables SPNEGO in curl.
    • The -u : option is required but the username is ignored (the principal that has been specified for kinit is used).
    • The -b and -c options are used to store and send HTTP cookies.
    • Cloudera does not recommend using the -k or --insecure option as it turns off curl's ability to verify the certificate.
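
The steps above can be sketched as two small shell functions. This is a hedged illustration rather than part of the documented procedure: supports_spnego, fetch_protected, and the COOKIE_JAR variable are hypothetical names chosen for this sketch. The feature check accepts GSS-Negotiate, as in the sample output above, and also SPNEGO, which is the name newer curl builds typically report for the same capability.

```shell
# Step 1 (sketch): read `curl -V` output on stdin and succeed only if the
# Features line lists GSS-Negotiate (older builds) or SPNEGO (newer builds).
supports_spnego() {
  awk '/^Features:/ {
         for (i = 2; i <= NF; i++)
           if ($i == "GSS-Negotiate" || $i == "SPNEGO") found = 1
       }
       END { exit !found }'
}

# Step 3 (sketch): fetch a SPNEGO-protected URL, reusing one cookie jar for
# both reading (-b) and writing (-c). Run kinit first (step 2).
# $1 = URL; any further arguments are passed to curl unchanged,
# for example --cacert /path/to/truststore.pem
fetch_protected() {
  url=$1
  shift
  jar="${COOKIE_JAR:-$HOME/cookiejar.txt}"
  curl --negotiate -u : -b "$jar" -c "$jar" "$@" "$url"
}
```

A possible invocation, after kinit has obtained a ticket, is curl -V | supports_spnego && fetch_protected 'https://localhost:14000/webhdfs/v1/?op=liststatus' --cacert /path/to/truststore.pem. Quoting the URL keeps the shell from interpreting the ? character.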