Integration of Cloudera’s Distribution for Hadoop With Greenplum Provides New Opportunities for Analysis of Structured and Complex Data
PALO ALTO, CA – September 22, 2010 – Cloudera, a leading provider of Hadoop-based data management software and services, and EMC Data Computing Division announced an alliance that will enable the integration of Cloudera’s Distribution for Hadoop (CDH) and Greenplum technology. The integration of CDH, for collecting, consolidating and analyzing data, with EMC Greenplum’s massively parallel processing database and enterprise data cloud platform will provide a robust architecture for collaborative analysis of large amounts of structured (e.g., online databases) and unstructured (e.g., log files, sensor data, documents) data.
As part of the alliance, Cloudera will build a connector between Cloudera’s Distribution for Hadoop and Greenplum technologies. The connector will enable high-speed bi-directional data transfer between the systems and will be jointly supported by Cloudera and Greenplum. Additionally, the Greenplum sales team will be trained on Cloudera’s suite of Apache Hadoop-based products and services.
The alliance between EMC Greenplum and Cloudera will change the way customers collect, process and store data. Today, customers use a combination of database and archive storage products to collect, process and store complex and structured data. They are required to shuttle the data between systems, transforming and structuring it before they can analyze it. As data volumes and types grow, there is no single place to store and process all of this data.
Hadoop is becoming an increasingly popular solution to this problem. Customers can easily stage their data in a single Hadoop-based repository, leveraging its ability to inexpensively store both complex and structured data. They can then iterate over the data using MapReduce to process and analyze it, create metadata layers, and transform it for loading into a Greenplum database. Additionally, customers can combine long-term historical data with new data, enabling deeper insight and the detection of patterns not visible over short time periods.
“Together EMC and Cloudera have a real opportunity to help companies change the way they collect, process and store data,” said Michael Olson, CEO of Cloudera. “Organizations can use CDH to inexpensively capture complex and structured data, while Greenplum Chorus utilizes its cloud-based platform to discover data from a variety of sources and enables collaborative analysis for end users.”
“EMC is building the data system of the future, a system that brings together all of your data, all of your tools, and all of your people,” said Bill Cook, President and General Manager of EMC’s Data Computing Division. “EMC and Cloudera represent a powerful combination of what we can deliver to customers. By bringing together our solutions, our customers have a powerful tool for collaborative data analysis and can more quickly and effectively analyze data from a variety of sources.”
CDH is the most comprehensive and broadly adopted Hadoop-based platform on the market, lowering the barrier to Hadoop adoption by making it simple to install and easy to integrate into the data center. It consists of core Apache Hadoop and eight additional open source projects, all tested and integrated into a single platform, making it the most complete Hadoop-based distribution. For more information about CDH, visit http://www.cloudera.com/hadoop/.
EMC will be exhibiting and presenting on its relationship with Cloudera at the annual Hadoop World conference taking place in New York City on October 12. Attend Hadoop World 2010 for additional examples of Hadoop in the enterprise.
Cloudera is revolutionizing enterprise data management by offering the first unified Platform for big data, an enterprise data hub built on Apache Hadoop. Cloudera offers enterprises one place to store, access, process, secure, and analyze all their data, empowering them to extend the value of existing investments while enabling fundamental new ways to derive value from their data. Cloudera's open source big data platform is the most widely adopted in the world, and Cloudera is the most prolific contributor to the open source Hadoop ecosystem. As the leading educator of Hadoop professionals, Cloudera has trained over 40,000 individuals worldwide. Over 1,700 partners and a seasoned professional services team help deliver faster time to value. Leading organizations in every industry plus top public sector organizations globally run Cloudera in production.
Cloudera, Cloudera's Platform for Big Data, Cloudera Enterprise Data Hub Edition, Cloudera Enterprise Flex Edition, Cloudera Enterprise Basic Edition and CDH are trademarks or registered trademarks of Cloudera Inc. in the United States, and in jurisdictions throughout the world. All other company and product names may be trade names or trademarks of their respective owners.