Sqoop Key Features
Efficiently move structured data:
Easily import and export bulk data between Hadoop and structured datastores (such as a data warehouse, relational database, or NoSQL system). Parallel processing ensures data is efficiently transferred and stored in Hadoop for unified analytics, and the results can be transferred back to optimize existing operational workloads, making data quickly available to a wide range of users and workloads.
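As a rough illustration of the parallel import and export described above, the following Python sketch only assembles and prints the Sqoop command lines; it does not run them. The host, database, table, and path names are hypothetical placeholders, and the helper functions are illustrative, not part of Sqoop. The flags shown (`--connect`, `--table`, `--target-dir`, `--export-dir`, `--split-by`, `--num-mappers`) are standard Sqoop 1 options; the number of mappers controls how many parallel tasks share the transfer.

```python
import shlex


def build_import_command(jdbc_url, username, password_file, table,
                         target_dir, split_by, num_mappers=4):
    """Assemble a `sqoop import` argument list (nothing is executed here)."""
    if num_mappers < 1:
        raise ValueError("num_mappers must be at least 1")
    return [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--username", username,
        "--password-file", password_file,   # file holding the password
        "--table", table,                   # source table in the database
        "--target-dir", target_dir,         # destination directory in HDFS
        "--split-by", split_by,             # column used to partition the work
        "--num-mappers", str(num_mappers),  # number of parallel map tasks
    ]


def build_export_command(jdbc_url, username, password_file, table, export_dir):
    """Assemble a `sqoop export` argument list for sending results back."""
    return [
        "sqoop", "export",
        "--connect", jdbc_url,
        "--username", username,
        "--password-file", password_file,
        "--table", table,            # destination table in the database
        "--export-dir", export_dir,  # HDFS directory holding the data to export
    ]


# Hypothetical connection details, for illustration only.
JDBC_URL = "jdbc:mysql://db.example.com:3306/sales"

import_cmd = build_import_command(
    JDBC_URL, "etl_user", "/user/etl/.db_password",
    table="orders", target_dir="/data/raw/orders",
    split_by="order_id", num_mappers=4,
)
export_cmd = build_export_command(
    JDBC_URL, "etl_user", "/user/etl/.db_password",
    table="order_summaries", export_dir="/data/results/order_summaries",
)

print(shlex.join(import_cmd))
print(shlex.join(export_cmd))
```

Keeping the command as an argument list rather than a single string avoids shell-quoting problems if it is later passed to a process launcher.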
Connect to existing systems:
Leverage pre-built connectors to easily integrate with the most popular external systems, including MySQL, PostgreSQL, Teradata, Netezza, and Oracle. All connectors are freely available and ready for production use so you can reliably bring existing, structured data into Hadoop to offload certain workloads and combine them with other data types for new insights.
Unified data for analytics:
Combine structured data with unstructured and semi-structured data in a single, flexible platform to discover powerful new insights and enhance existing analytics. You can easily bring unlimited amounts of structured data into the platform, store it alongside other data types, and share it with users across the company — whether they want to optimize batch processing, build machine-learning models, or anything in between.
Common Use Cases
As the standard tool for bringing structured data into Hadoop, Sqoop is a critical component for building a variety of end-to-end workloads to analyze unlimited data of any type. Typical use cases include:
- Active archive
- Optimized data processing
- Ad hoc data discovery
Integrated across the platform
As an integrated part of Cloudera’s platform, Sqoop works with other components, such as Apache Hive and Impala, to make data accessible within a single platform. It also benefits from unified resource management (through YARN), simple deployment and administration (through Cloudera Manager), and shared compliance-ready security and governance (through Apache Sentry and Cloudera Navigator), all of which are critical for running in production.
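One way the Hive integration can look in practice is an import that loads a relational table directly into a Hive table. The Python sketch below only builds and prints such a command, under the assumption of a hypothetical PostgreSQL source; the connection details, table names, and helper function are placeholders for illustration. `--hive-import` and `--hive-table` are standard Sqoop 1 options.

```python
import shlex


def build_hive_import_command(jdbc_url, username, password_file,
                              table, hive_table):
    """Assemble a `sqoop import` that loads into Hive (nothing is executed)."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--username", username,
        "--password-file", password_file,
        "--table", table,            # source table in the database
        "--hive-import",             # load the imported data into Hive
        "--hive-table", hive_table,  # destination Hive table
    ]


# Hypothetical connection details, for illustration only.
hive_cmd = build_hive_import_command(
    "jdbc:postgresql://db.example.com:5432/crm",
    "etl_user", "/user/etl/.db_password",
    table="customers", hive_table="analytics.customers",
)

print(shlex.join(hive_cmd))
```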
Cloudera's commitment to Sqoop
Cloudera, the original developer of Sqoop, is actively involved with the Sqoop community, with committers on staff who continue to drive Sqoop innovation. Because Sqoop is a deeply integrated part of the platform, Cloudera has built in critical production-ready capabilities, especially around scalability and administrative ease, helping to solidify Sqoop’s place as an open standard for Hadoop.
Cloudera’s engineering expertise, combined with Support experience with large-scale production customers, means you get direct access to the roadmap, and influence over it, based on your needs and use cases.
Partnered with the ecosystem
Seamlessly integrate with the tools your business already uses by leveraging Cloudera’s ecosystem of 1,700+ partners. With a robust partner certification program, we are continuously working to build out production-hardened integrations between Sqoop and the most popular structured datastores.
Expert support for Sqoop
Cloudera has Sqoop experts, trained by the creators of Sqoop, available across the globe and ready to deliver world-class support 24/7. With more experience across more production customers, for more use cases, Cloudera is the leader in Sqoop support so you can focus on results.