Cloudera and Informatica
Informatica’s high-performance, universal connectivity now enables companies to move almost any data into and out of Apache Hadoop. Companies can use Informatica for pre- and post-processing of the data, including parsing and transforming structured and unstructured data, data cleansing, identity resolution, data masking and master data management.
About Informatica
Informatica Corporation is the world's number one independent provider of data integration software. Organizations around the world gain a competitive advantage in today's global information economy with timely, relevant and trustworthy data for their top business imperatives. More than 4,440 enterprises worldwide rely on Informatica to access, integrate and trust their information assets held in the traditional enterprise, off premises and in the cloud.
The Informatica and Cloudera Joint Solution
Universal data access
- Maximize the business value of Big Data at any volume, velocity and variety.
- Ensure repeatability and maintainability with native access to all of your data.
- Promote flexibility in choosing where to keep source data between Hadoop and other systems.
- Virtualize access to any data in and out of Hadoop.
Data parsing and exchange
- Reduce the development time and maintenance costs of parsing and transforming any data format with a visual design environment.
- Define and deploy any parser on Hadoop for complex and unstructured data sources.
Data quality and governance
- Promote trust and governance in reporting and analytics with Hadoop.
- Increase efficiency in uncovering data characteristics and fixing data quality issues.
- Improve collaboration between analysts and developers through data profiling and data virtualization.
Metadata management and auditability
- Align business and IT through a common taxonomy of metadata definitions including business terms and technical definitions.
- Shorten project delivery with end-to-end personalized lineage and reduce risk through data audits and monitoring.
Processing in Hadoop
- Take advantage of Hadoop's high data processing power for transformation logic.
- Decouple data processing logic to design once and deploy anywhere in a hybrid IT environment.
High-throughput data provisioning
- Meet data service levels ranging from years of historical data to microsecond latency using Hadoop and other enterprise systems.
- Ensure up-to-date data delivery between Hadoop and the rest of the enterprise: batch, mini-batch, change data, streaming data, and message data.
- Improve data processing through efficient partitioning and redistribution of data in Hadoop, guided by data profiling.