This is the documentation for Cloudera 5.4.x. Documentation for other versions is available at Cloudera Documentation.

Sqoop 2 Installation

Sqoop 2 is a server-based tool designed to transfer data between Hadoop and relational databases. You can use Sqoop 2 to import data from a relational database management system (RDBMS) such as MySQL or Oracle into the Hadoop Distributed File System (HDFS), transform the data with Hadoop MapReduce, and then export it back into an RDBMS.

  Note: Moving from Sqoop 1 to Sqoop 2

Sqoop 2 is the future direction of the Apache Sqoop project. However, because Sqoop 2 currently lacks some of the features of Sqoop 1, Cloudera recommends that you use Sqoop 2 only if it provides all the features your use case requires; otherwise, continue to use Sqoop 1.

There are three packaging options for installing Sqoop 2:

  • Tarball (.tgz) that contains both the Sqoop 2 server and the client
  • Separate RPM packages for the Sqoop 2 server (sqoop2-server) and client (sqoop2-client)
  • Separate Debian packages for the Sqoop 2 server (sqoop2-server) and client (sqoop2-client)
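
As a rough illustration of the package-based options above, the following sketch prints the install command for a given package manager and component without running it. The helper name sqoop2_install_cmd is hypothetical, and the sketch assumes that the Cloudera package repository is already configured on the host, that yum is used on RPM-based systems, and that apt-get is used on Debian-based systems. Only the package names sqoop2-server and sqoop2-client come from this page.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) the install command for a Sqoop 2 package.
#   $1: package manager, assumed to be yum (RPM) or apt-get (Debian)
#   $2: component, either "server" or "client"
sqoop2_install_cmd() {
    case "$2" in
        server|client) pkg="sqoop2-$2" ;;
        *) echo "unknown component: $2" >&2; return 1 ;;
    esac
    case "$1" in
        yum|apt-get) echo "sudo $1 install $pkg" ;;
        *) echo "unknown package manager: $1" >&2; return 1 ;;
    esac
}

# Example: show the commands for an RPM-based server host and a Debian-based client host.
sqoop2_install_cmd yum server
sqoop2_install_cmd apt-get client
```

Printing the command rather than executing it lets you review it before running it with the privileges your system requires. The server and client packages can be installed on different hosts.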

These topics describe the steps to install Sqoop 2.

Page generated August 17, 2015.