Informatica and Cloudera Deliver Game-Changing Data Warehouse Optimization Architecture for Today's Data-Driven Business World

Redwood City, Calif., and New York, October 29, 2013 - At the Strata Conference + Hadoop World 2013, Informatica Corporation (Nasdaq:INFA), the world’s number one independent provider of data integration software, and Cloudera, the leader in enterprise analytic data management powered by Apache Hadoop™, support services and training, today announced a jointly designed reference architecture for optimizing data warehouses for today’s data-driven business world.

The new Data Warehouse Optimization (DWO) reference architecture, designed specifically for Enterprise Data Hub deployments, addresses the challenges facing traditional data warehouse infrastructures, where capacity is too quickly consumed by increasing data volumes, leading to performance bottlenecks and costly upgrades. The DWO architecture empowers companies to optimally deploy an Enterprise Data Hub, a central system to land and work with all data in a variety of ways, together with the tools, security and governance customers require. An Enterprise Data Hub is a complementary technology to data warehouse implementations, enabling organizations to store and process data at any scale, to dramatically reduce data warehouse costs, and to boost developer productivity by up to a factor of five.

The proven core building blocks for implementing the DWO architecture are Cloudera Enterprise, a subscription offering that combines CDH (Cloudera’s 100 percent open source distribution of Apache Hadoop), Cloudera Manager and Cloudera Navigator, and Informatica PowerCenter Big Data Edition, powered by Informatica Vibe. Informatica Vibe is the world’s first and only embeddable virtual data machine (VDM), with “map once, deploy anywhere” data integration.

“Legacy environments are not going away, but they need to be augmented by Hadoop-based solutions to meet the demands of big data,” said Todd Goldman, vice president and general manager, Enterprise Data Integration, Informatica. “The Cloudera and Informatica Data Warehouse Optimization reference architecture helps companies leverage their existing environment with emerging technologies using readily available skills, so organizations can more affordably and efficiently unlock the massive potential of big data.”

Fast-growing data volumes and new types of data sources, ranging from cloud and mobile apps to social media and machine data, are placing substantial demands on current data warehouse infrastructures. To optimize their data warehouse environments, organizations are seeking ways to support unlimited data volumes while reducing infrastructure costs through industry-standard hardware and software and minimizing operational costs through existing skills. They are also seeking ways to support all types of data and to easily integrate new and existing types of infrastructure.

“One of the best ways to introduce Cloudera into an organization’s data management infrastructure is to start by optimizing the data warehouse environment,” said Charles Zedlewski, vice president, Products, Cloudera. “The Cloudera and Informatica DWO reference architecture has the dual benefit of dramatically lowering costs and providing an enterprise-ready data platform that cost-effectively scales to meet the data storage and processing requirements for big data projects.”

The DWO reference architecture addresses all these requirements through the combination of Informatica and Cloudera technologies. Informatica delivers a broad and mature set of data integration and data management capabilities around Hadoop. Cloudera Enterprise enables cost-effective, scalable storage and processing on commodity infrastructure, along with enterprise-grade security, high availability, cluster management, and low-latency querying. The joint reference architecture includes technologies and solutions that:

  • Lower infrastructure and operational costs – Delivers the killer app on Cloudera, so organizations can cost-effectively scale data storage and processing on industry-standard hardware and open-source software using readily available resource skills.
  • Use existing resource skills to staff projects – Many data warehouse organizations already have ETL developers and consultants on staff trained on Informatica. With the Informatica PowerCenter Big Data Edition, every Informatica developer is now a Hadoop developer without having to become a Hadoop expert. With Informatica’s and Cloudera’s world-class support and training organizations, users can staff the development and administration of data warehouse projects on Cloudera with readily available resource skills.
  • Future-proof the data warehouse and drive productivity – Informatica Vibe enables data integration and ETL processes to be written just once and deployed anywhere. This means that existing ETL processes created using Informatica’s codeless visual development paradigm can be redeployed on Cloudera Enterprise with minimal effort, resulting in a more resilient data warehouse infrastructure and an up-to-5x productivity gain for developers. Rapid development is further enhanced with Informatica Vibe for rapid ETL prototyping and Cloudera’s Impala for real-time interactive queries to discover insights faster.
  • Optimize data warehouse performance – Informatica PowerCenter Big Data Edition deploys on Cloudera Enterprise to load, profile, parse and transform data for analysis in a high-performance, cost-effective fashion. Optimal processing flows can be defined quickly using Informatica’s visual design interface and extensive library of pre-built transforms.
  • Handle virtually all types of data and sources – With Informatica, nearly all types of data – including legacy, ERP, CRM, social and machine – can be accessed and integrated through a variety of methods ranging from batch to replication, change data capture (CDC) and real-time streaming. Newly released Informatica Vibe Data Stream for Machine Data technology, for example, collects and streams high-volume, real-time machine data into Hadoop to drive new levels of operational intelligence.
  • Ensure data quality – Informatica Data Quality Big Data Edition executes data quality and matching rules on Cloudera Enterprise to ensure trust in the data.
  • Ensure enterprise-ready deployments that meet business SLAs – With Informatica Vibe’s “Map Once, Deploy Anywhere” virtual data machine technology, users can immediately deploy ETL jobs from development into production. The combination of Informatica’s unified administration and Cloudera Manager makes it easy to manage ETL workloads on Cloudera for data warehouse projects.

The Data Warehouse Optimization reference architecture from Cloudera and Informatica is available now for implementation. 

Visit Informatica at Kiosk 63 and Cloudera at Booth 403 at the Strata Conference + Hadoop World 2013, Oct. 28-30 at the New York Hilton Midtown.

Tweet this: News: @Cloudera and @InformaticaCorp Team to Optimize the #DataWarehouse


About Informatica

Informatica Corporation (Nasdaq:INFA) is the world’s number one independent provider of data integration software. Organizations around the world rely on Informatica to realize their information potential and drive top business imperatives. Informatica Vibe, the industry’s first and only embeddable virtual data machine (VDM), powers the unique “Map Once. Deploy Anywhere.” capabilities of the Informatica Platform. Worldwide, over 5,000 enterprises depend on Informatica to fully leverage their information assets from devices to mobile to social to big data residing on-premise, in the Cloud and across social networks. For more information, call +1 650-385-5000 (1-800-653-3871 in the U.S.).

About Cloudera

Cloudera is revolutionizing enterprise data management by offering the first unified Platform for big data, an enterprise data hub built on Apache Hadoop. Cloudera offers enterprises one place to store, access, process, secure, and analyze all their data, empowering them to extend the value of existing investments while enabling fundamental new ways to derive value from their data. Cloudera's open source big data platform is the most widely adopted in the world, and Cloudera is the most prolific contributor to the open source Hadoop ecosystem. As the leading educator of Hadoop professionals, Cloudera has trained over 40,000 individuals worldwide. Over 1,700 partners and a seasoned professional services team help deliver faster time to value. Leading organizations in every industry plus top public sector organizations globally run Cloudera in production.

Cloudera, Cloudera's Platform for Big Data, Cloudera Enterprise Data Hub Edition, Cloudera Enterprise Flex Edition, Cloudera Enterprise Basic Edition and CDH are trademarks or registered trademarks of Cloudera Inc. in the United States, and in jurisdictions throughout the world. All other company and product names may be trade names or trademarks of their respective owners.

Press Inquiries

Deborah Wiltshire
