DataFlow: Collect flows, streams, and analytics for the lifecycle
Ingest, route, manage, and deliver data at rest and data in motion from the edge, any cloud, or the data center to any downstream system, with built-in end-to-end security and provenance. CDP Data Hub uses Apache NiFi for flow management and Apache Kafka for streams messaging. Both are part of Cloudera DataFlow, a real-time streaming data platform that delivers key insights and immediately actionable intelligence.
Data Engineering: Enrich, refine, structure, and prepare data for the lifecycle
Cloudera Data Engineering helps enrich, transform, and cleanse data, making it easy to create, execute, and manage end-to-end data pipelines. It runs a wide range of high-performance data processing workloads, including batch and real-time stream processing with Apache Spark and Spark Streaming, and supports multiple storage options, including Apache HBase, Apache Kudu, and cloud object storage.
Data Warehouse: Provide self-service access to reporting for the lifecycle
Deliver business insights on massive amounts of verified data to thousands of users at extreme speed and scale without compromising compliance or blowing budgets. Cloudera Data Warehouse moves on-premises workloads to any cloud seamlessly and securely, and it outperforms shadow IT by keeping up with evolving business requirements and meeting SLAs with self-service access to reports, dashboards, and advanced analytics.
Operational DB: Serve all types of data from all sources for the lifecycle
Cloudera Operational Database serves structured data alongside unstructured data within a unified, end-to-end open-source platform, so that decision making is driven by stream processing and real-time analytics on continuously changing data. Users can serve real-time data at scale, with high concurrency and low latency, and run data science at scale to build, score, and deploy machine learning models into production.
Machine Learning: Operationalize prediction for the lifecycle
Accelerate enterprise data science from research to production at scale with self-service, collaborative workflows for building and operationalizing machine learning models. Using Python, R, and Scala directly in the web browser, Cloudera Machine Learning delivers a powerful self-service experience for data science teams to develop and prototype new machine learning capabilities and easily deploy them to production.
SDX: Ensure security, governance, and lineage across the lifecycle
Cloudera SDX (Shared Data Experience) provides an enterprise-wide data security and governance fabric that binds the data lifecycle. SDX lets data and metadata security and governance policies be set once and automatically enforced across the data lifecycle in hybrid, private, or multi-cloud environments, delivering safe and compliant data access across the organization.
Control Plane: Manage CDP services with common tools across the lifecycle
Manage, monitor, and orchestrate all CDP services from a single pane of glass with consistent security and governance. Consisting of Workload Manager, Replication Manager, Data Catalog, and Management Console, Control Plane delivers a set of tools for data management, workload analysis, data movement, and data discovery that enable multi-function analytics anywhere.