Copying Data between a Secure and an Insecure Cluster using DistCp and WebHDFS

You can use DistCp and WebHDFS to copy data between a secure cluster and an insecure cluster. When doing this, run the distcp commands from the secure cluster. To set up the copy, do the following:
  1. On the secure cluster, set ipc.client.fallback-to-simple-auth-allowed to true in core-site.xml:
    <property>
      <name>ipc.client.fallback-to-simple-auth-allowed</name>
      <value>true</value> 
    </property>
  2. On the insecure cluster, add the secure cluster's realm name to the insecure cluster's configuration:
    1. In the Cloudera Manager Admin Console for the insecure cluster, navigate to Clusters > <HDFS cluster>.
    2. On the Configuration tab, search for Trusted Kerberos Realms and add the secure cluster's realm name.

      Note that this does not require Kerberos to be enabled, but it is a necessary step to allow the simple auth fallback to happen in the hdfs:// protocol.

    3. Save the change.
  3. Run commands such as the following, from the secure cluster only:
    hadoop distcp webhdfs://insecureCluster webhdfs://secureCluster
    hadoop distcp webhdfs://secureCluster webhdfs://insecureCluster
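
The steps above can be sketched as a small shell helper that assembles a DistCp command line. This is an illustrative sketch, not a definitive procedure: the host names, the port 9870 (the default NameNode HTTP port in Hadoop 3; older releases commonly use 50070), the paths, and the function name build_distcp_cmd are all assumptions that do not come from the steps above. The sketch also passes the fallback property with the generic -D option, which applies it to a single run as an alternative to editing core-site.xml. It only prints the command, so you can review it before running it on a host in the secure cluster, typically after obtaining a Kerberos ticket with kinit.

```shell
#!/usr/bin/env bash
# Sketch only: host names, ports, and paths below are hypothetical placeholders.
set -euo pipefail

# Hypothetical WebHDFS endpoints (NameNode HTTP host:port) and paths.
SECURE_URI="webhdfs://secure-nn.example.com:9870/data/src"
INSECURE_URI="webhdfs://insecure-nn.example.com:9870/data/dest"

# Print a DistCp command line for the given source and destination URIs.
# The -D generic option sets the fallback property for this run only.
build_distcp_cmd() {
  local src="$1" dst="$2"
  printf 'hadoop distcp -D ipc.client.fallback-to-simple-auth-allowed=true %s %s\n' \
    "$src" "$dst"
}

# Dry run: print the command for a copy from the secure to the insecure cluster.
build_distcp_cmd "$SECURE_URI" "$INSECURE_URI"
```

To copy in the other direction, swap the two arguments. In both cases the printed command is meant to be run from the secure cluster.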