Overview
Control data distribution while allowing the flexibility to deliver data anywhere.
CDF-PC offers a flow-based, low-code development paradigm that aligns with how developers design, develop, and test data distribution pipelines. With more than 450 connectors and processors across the ecosystem of hybrid cloud services (including data lakes, lakehouses, cloud warehouses, and on-premises sources), CDF-PC provides indiscriminate data distribution. These data distribution flows can then be version-controlled in a catalog from which operators can self-serve deployments to different runtimes.
CLOUDERA DATAFLOW FOR PUBLIC CLOUD
Universal data distribution powered by Apache NiFi

Connect to any data source anywhere, process, and deliver to any destination
Use cases
Serverless no-code microservices
Near real-time file processing
Data lakehouse ingest
Cybersecurity & log optimization
IoT & streaming data collection
Serverless no-code microservices
DataFlow Functions is the first visual no-code solution to build microservices with infinite scaling.
By running NiFi flows within AWS Lambda, Azure Functions, and Google Cloud Functions, DataFlow Functions is the first solution to provide an easy-to-use, no-code UI for building microservices triggered by HTTPS requests. It lets you quickly build API endpoints with infinite scaling in a serverless environment.
Near real-time file processing
DataFlow Functions enables near real-time file processing in a serverless architecture.
By running NiFi flows within AWS Lambda, Azure Functions, and Google Cloud Functions, DataFlow Functions provides the most cost-effective way to process files as soon as they become available in the object store. Resources run only while data is being processed, so NiFi no longer needs to run 24/7. The result is a fully serverless architecture with no infrastructure operations cost.
Data lakehouse ingest
Modernize data pipelines with a single tool that works with any data lakehouse or warehouse.
With support for more than 450 processors, Cloudera DataFlow makes it easy to collect and transform data into the format that your lakehouse of choice requires.
Cloudera DataFlow gives you two options for unstructured data. You can treat it as unstructured and achieve high throughput by not enforcing a schema, or you can give it a structure by applying a schema and then use the NiFi Expression Language or SQL queries to transform it.
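As an illustration of the schema-based option, here is a minimal sketch in plain Python, not NiFi code. It applies a schema to raw CSV text and then filters the structured rows with a SQL query. The field names (`device`, `temp_c`), the sample payload, and the use of the standard-library `sqlite3` module are assumptions chosen for the example; in a NiFi flow, record-oriented processors would play this role.

```python
import csv
import io
import sqlite3

# Hypothetical raw payload; without a schema it is just opaque text.
RAW = "sensor-a,21.5\nsensor-b,38.2\nsensor-a,22.1\n"


def apply_schema(raw: str) -> list[tuple[str, float]]:
    """Give the raw text a structure: (device TEXT, temp_c REAL)."""
    reader = csv.reader(io.StringIO(raw))
    return [(device, float(temp)) for device, temp in reader]


def hot_readings(rows, threshold: float):
    """Filter the structured rows with a SQL query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE readings (device TEXT, temp_c REAL)")
        conn.executemany("INSERT INTO readings VALUES (?, ?)", rows)
        cur = conn.execute(
            "SELECT device, temp_c FROM readings "
            "WHERE temp_c > ? ORDER BY temp_c",
            (threshold,),
        )
        return cur.fetchall()
    finally:
        conn.close()


rows = apply_schema(RAW)
result = hot_readings(rows, 30.0)
```

Skipping `apply_schema` and passing `RAW` along untouched corresponds to the schema-less, high-throughput option.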
Cybersecurity & log optimization
Enable data analysts to detect and analyze events faster and more accurately by curating SIEM data.
Lower the cost of your cybersecurity solution by modernizing the data collection pipelines to collect and filter real-time data from thousands of sources worldwide.
Ingesting all device and application logs into your SIEM solution is not a scalable approach from a cost and performance perspective. Cloudera DataFlow allows you to collect log data from anywhere and filter out the noise, keeping the data stored in your SIEM system manageable.
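The idea of filtering out noise before it reaches the SIEM can be sketched as a simple severity filter. This is an illustrative Python sketch, not Cloudera DataFlow code: the `SEVERITY` ranking, the event dictionaries, and the `filter_noise` helper are all hypothetical, and a real flow would apply comparable rules inside NiFi processors.

```python
from typing import Iterable, Iterator

# Hypothetical severity ranking; real sources may use different level names.
SEVERITY = {"DEBUG": 0, "INFO": 1, "WARN": 2, "ERROR": 3}


def filter_noise(events: Iterable[dict], min_level: str = "WARN") -> Iterator[dict]:
    """Yield only events at or above min_level.

    Events with a missing or unknown level are kept, so that nothing
    unexpected is silently dropped before it reaches the SIEM.
    """
    floor = SEVERITY[min_level]
    for event in events:
        rank = SEVERITY.get(event.get("level", ""), floor)
        if rank >= floor:
            yield event


events = [
    {"level": "DEBUG", "msg": "heartbeat"},
    {"level": "INFO", "msg": "user login"},
    {"level": "ERROR", "msg": "auth failure"},
    {"level": "AUDIT", "msg": "policy change"},
]
forwarded = list(filter_noise(events))
```

Keeping unknown levels is a deliberate choice in this sketch: for security data, forwarding too much is usually a safer failure mode than dropping an unrecognized event.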
IoT & streaming data collection
Send data from IoT devices at the edge to a central data flow in the cloud that scales up and down as needed.
Cloudera DataFlow is built for handling streaming data at scale, allowing organizations to start their IoT projects small, but with the confidence that their data flows can manage data bursts caused by adding more source devices as well as handle intermittent connectivity issues.
DataFlow Designer provides a self-service, no-code UI with iterative testing capabilities that allow every developer to build and validate new data flows in a heartbeat.
The DataFlow Functions runtime provides an efficient, cost-optimized, scalable way to run NiFi flows in a completely serverless fashion for event-driven use cases.
DataFlow deployments automatically scale NiFi flows up and down based on CPU utilization. Infrastructure costs can be controlled by setting minimum and maximum boundaries for auto-scaling.
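The effect of minimum and maximum boundaries can be sketched with a small Python function. The source does not describe the actual scaling algorithm, so this sketch assumes a proportional rule similar in spirit to the Kubernetes Horizontal Pod Autoscaler; the function name, its parameters, and the target utilization are hypothetical.

```python
import math


def desired_nodes(current: int, cpu_pct: float, target_pct: float,
                  min_nodes: int, max_nodes: int) -> int:
    """Proportional scaling, clamped to [min_nodes, max_nodes].

    Assumed rule (not confirmed by the source):
    wanted = ceil(current * observed CPU / target CPU).
    """
    if min_nodes > max_nodes:
        raise ValueError("min_nodes must not exceed max_nodes")
    wanted = math.ceil(current * cpu_pct / target_pct)
    # The boundaries cap cost (max) and guarantee baseline capacity (min).
    return max(min_nodes, min(max_nodes, wanted))


scale_up = desired_nodes(3, 90.0, 60.0, min_nodes=1, max_nodes=10)
scale_down = desired_nodes(3, 10.0, 60.0, min_nodes=2, max_nodes=10)
capped = desired_nodes(8, 120.0, 60.0, min_nodes=1, max_nodes=10)
```

In the last call the proportional rule alone would ask for 16 nodes; the maximum boundary holds the deployment at 10, which is how the boundary limits infrastructure cost.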
Connect to any data source or target using NiFi's rich processor library, including on-premises data sources, cloud data storage, cloud data warehouses, log data sources, cloud data analytics services, or cloud business process services. Developers can also quickly deploy ReadyFlows, a predefined set of data flows that implement the most common data flow use cases with minimal configuration.
Monitor all your NiFi flow deployments in a single dashboard, no matter which cloud they run on. Track important flow performance metrics by defining KPI alerts for your flow deployments.
Provision secure, stable, and scalable endpoints so that any application can easily send data to flow deployments.
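From the sending application's side, delivering data to such an endpoint could look like the following Python sketch. The endpoint URL, the JSON payload shape, and the `build_event_request` helper are hypothetical, and authentication, which a real endpoint would likely require, is left out. The request is only prepared, not sent, so the sketch runs offline.

```python
import json
import urllib.request


def build_event_request(endpoint_url: str, event: dict) -> urllib.request.Request:
    """Prepare (but do not send) an HTTPS POST carrying one JSON event."""
    if not endpoint_url.startswith("https://"):
        raise ValueError("expected an https:// endpoint")
    body = json.dumps(event).encode("utf-8")
    return urllib.request.Request(
        endpoint_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical endpoint; replace with the one shown for your deployment.
req = build_event_request(
    "https://flow.example.com/ingest", {"id": 1, "status": "ok"}
)
# urllib.request.urlopen(req) would send it; omitted so the sketch runs offline.
```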
Universal connectivity
Cloudera DataFlow offers universal connectivity to any system through purpose-built connectors for data streams, databases, data lakes & data warehouses, enterprise applications, and file systems as well as generic connectors leveraging industry standard protocols such as HTTP, Syslog, UDP, TCP, and more.
FEATURED CONNECTORS

Apache Iceberg
DATA LAKES & DATA WAREHOUSES

Apache Kafka
DATA STREAMS
Azure Data Lake Storage
DATA LAKES & DATA WAREHOUSES

Google BigQuery
DATA LAKES & DATA WAREHOUSES

MongoDB
DATABASES

Salesforce
ENTERPRISE APPLICATIONS

Snowflake
DATA LAKES & DATA WAREHOUSES

Splunk
ENTERPRISE APPLICATIONS
DEVELOP SELF-SERVICE NIFI FLOWS & DEPLOY TO ANY CLOUD
as auto-scaling Kubernetes clusters or serverless NiFi flows
Runtime options in the public cloud
Feature | DataFlow Deployments | DataFlow Functions |
---|---|---|
Cloud Runtime | NiFi Clusters using Kubernetes/Containers | NiFi flows running on cloud providers’ serverless compute services (AWS Lambda, Azure Functions, and Google Cloud Functions) |
Use Case | Use cases that need low latency for high throughput workloads requiring always running NiFi flows | Event driven, micro-bursty use cases with no sub-second latency requirement where NiFi flows do not need to run continuously |
Benefits | Auto-scaling Kubernetes clusters for long running workflows with centralized monitoring | Efficient, cost optimized, scalable way to run NiFi flows serverless allowing developers to focus on business logic |
Metering Unit | Cloudera Compute Unit (CCU) | Method Invocation Count |
Collect data from the edge
Use Cloudera Edge Management to manage, control, and monitor the edge for streaming and IoT initiatives, and to deliver real-time streaming data with no-code ingestion and management.
Get started
PRODUCT DOCUMENTATION
Find technical specs, architecture, and tutorials about Cloudera DataFlow for the Public Cloud.
CDF FOR THE PUBLIC CLOUD PRICING
Evaluate Cloudera DataFlow for the Public Cloud pricing across public cloud instances.
DATAFLOW OVERVIEW TOUR
Get a hands-on tour of Cloudera DataFlow for the Public Cloud.
CLOUDERA COMMUNITY WITH NIFI
Connect with your peers, ask questions, troubleshoot, and learn more about Apache NiFi.
NIFI TRAINING
Book a three-day hands-on training course on Apache NiFi fundamentals and more.
PRODUCT DEMO
Watch a demo of Cloudera DataFlow for the Public Cloud as well as other CDP demos.