Overview
 

Control data distribution while allowing the flexibility to deliver data anywhere.


Cloudera DataFlow for the Public Cloud (CDF-PC) offers a flow-based, low-code development paradigm that aligns with how developers design, develop, and test data distribution pipelines. With more than 450 connectors and processors across the ecosystem of hybrid cloud services (including data lakes, lakehouses, cloud warehouses, and on-premises sources), CDF-PC distributes data regardless of source or destination. These data distribution flows can then be version-controlled in a catalog, from which operators can self-serve deployments to different runtimes.

CLOUDERA DATAFLOW FOR PUBLIC CLOUD

Universal data distribution powered by Apache NiFi

CDF for Public Cloud diagram

Connect to any data source anywhere, process, and deliver to any destination

Use cases

  • Serverless no-code microservices
  • Near real-time file processing
  • Data lakehouse ingest
  • Cybersecurity & log optimization
  • IoT & streaming data collection

Serverless no-code microservices


DataFlow Functions is the first visual no-code solution to build microservices with infinite scaling.

By running NiFi flows within AWS Lambda, Azure Functions, and Google Cloud Functions, DataFlow Functions is the first solution to provide an easy-to-use, no-code UI for building microservices triggered by HTTPS requests. It lets you quickly build API endpoints with infinite scaling in a serverless environment.

 

Near real-time file processing


DataFlow Functions enables near real-time file processing in a serverless architecture.

By running NiFi flows within AWS Lambda, Azure Functions, and Google Cloud Functions, DataFlow Functions provides a cost-effective way to process files as soon as they become available in the object store. Resources run only while data is being processed, so NiFi no longer needs to run 24/7. The result is a fully serverless architecture with no infrastructure to operate.
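
To make the trigger model concrete, here is a minimal Python sketch of reading the kind of object-store event that starts a function on AWS. It is not DataFlow Functions code. The helper name `objects_from_event` and the sample bucket and key are hypothetical; only the `Records[].s3` layout follows the Amazon S3 event notification format.

```python
from urllib.parse import unquote_plus


def objects_from_event(event):
    """Return (bucket, key) pairs for each object named in an S3 event.

    Hypothetical helper for illustration only; a DataFlow Functions
    deployment passes the trigger event to the NiFi flow instead.
    """
    pairs = []
    for record in event.get("Records", []):
        s3 = record.get("s3", {})
        bucket = s3.get("bucket", {}).get("name")
        key = s3.get("object", {}).get("key")
        if bucket and key:
            # Object keys arrive URL-encoded in S3 notifications.
            pairs.append((bucket, unquote_plus(key)))
    return pairs


# Hypothetical event for a file landing in a bucket.
sample_event = {
    "Records": [
        {
            "s3": {
                "bucket": {"name": "landing-zone"},
                "object": {"key": "incoming/report+2023.csv"},
            }
        }
    ]
}

print(objects_from_event(sample_event))
```

Because the function runs only when such an event arrives, no compute is consumed between file arrivals.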

Data lakehouse ingest


Modernize data pipelines with a single tool that works with any data lakehouse or warehouse.

With support for more than 450 processors, Cloudera DataFlow makes it easy to collect and transform data into the format that your lakehouse of choice requires.

Cloudera DataFlow gives you the flexibility to handle data either way. You can treat unstructured data as such and achieve high throughput by not enforcing a schema, or you can give it structure by applying a schema and then transform it with the NiFi Expression Language or SQL queries.

 

Cybersecurity & log optimization


Enable data analysts to detect and analyze events faster and more accurately by curating SIEM data.

Lower the cost of your cybersecurity solution by modernizing the data collection pipelines to collect and filter real-time data from thousands of sources worldwide.

Ingesting all device and application logs into your SIEM solution is not a scalable approach from a cost and performance perspective. Cloudera DataFlow allows you to collect log data from anywhere and filter out the noise, keeping the data stored in your SIEM system manageable.
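
As a rough illustration of filtering out noise before data reaches the SIEM, the Python sketch below keeps only records at or above a severity threshold. It is an assumption-laden example, not Cloudera DataFlow code: in a real deployment this logic would live in NiFi processors, and the `severity` field, the level names, and the `filter_for_siem` helper are all hypothetical.

```python
# Hypothetical severity ranking; real log sources use their own levels.
SEVERITY_ORDER = {"DEBUG": 0, "INFO": 1, "WARN": 2, "ERROR": 3, "CRITICAL": 4}


def filter_for_siem(records, min_severity="WARN"):
    """Keep records at or above min_severity; keep unknown levels too."""
    threshold = SEVERITY_ORDER[min_severity]
    kept = []
    for record in records:
        level = SEVERITY_ORDER.get(str(record.get("severity", "")).upper())
        # Records with an unrecognized severity are forwarded rather than
        # dropped, so nothing unusual is silently discarded.
        if level is None or level >= threshold:
            kept.append(record)
    return kept


# Hypothetical sample records from two sources.
logs = [
    {"source": "fw-01", "severity": "DEBUG", "msg": "heartbeat"},
    {"source": "fw-01", "severity": "ERROR", "msg": "denied connection"},
    {"source": "app-7", "severity": "info", "msg": "user login"},
    {"source": "app-7", "severity": "AUDIT", "msg": "role changed"},
]

forwarded = filter_for_siem(logs)
print(len(forwarded), "of", len(logs), "records forwarded")
```

Forwarding unknown severities by default is a deliberate choice here: for security data, dropping something unexpected usually costs more than storing it.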

IoT & streaming data collection


Send data from IoT devices at the edge to a central data flow in the cloud that scales up and down as needed.

Cloudera DataFlow is built for handling streaming data at scale, allowing organizations to start their IoT projects small, but with the confidence that their data flows can manage data bursts caused by adding more source devices as well as handle intermittent connectivity issues.

Key features

DataFlow Designer provides a self-service, no-code UI with iterative testing capabilities that allow every developer to build and validate new data flows in a heartbeat.

The DataFlow Functions runtime provides an efficient, cost-optimized, scalable way to run NiFi flows in a completely serverless fashion for event-driven use cases.

DataFlow deployments automatically scale NiFi flows up and down based on CPU utilization. You can control infrastructure costs by setting minimum and maximum boundaries for auto-scaling.
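
The effect of minimum and maximum boundaries can be sketched in a few lines of Python. This is not Cloudera's scaling algorithm, which the text does not describe. The sketch assumes the proportional rule used by the Kubernetes Horizontal Pod Autoscaler, and the function name, parameters, and sample values are hypothetical.

```python
import math


def desired_nodes(current_nodes, cpu_percent, target_percent,
                  min_nodes, max_nodes):
    """Proportional scaling rule, clamped to [min_nodes, max_nodes].

    Assumed rule (borrowed from the Kubernetes HPA, not confirmed for
    DataFlow): desired = ceil(current * observed CPU / target CPU).
    """
    if target_percent <= 0:
        raise ValueError("target_percent must be positive")
    if min_nodes > max_nodes:
        raise ValueError("min_nodes must not exceed max_nodes")
    raw = math.ceil(current_nodes * cpu_percent / target_percent)
    # The boundaries cap cost on the high side and guarantee a floor
    # of capacity on the low side.
    return max(min_nodes, min(max_nodes, raw))


# Hypothetical cases: 3 nodes at 90% CPU against a 60% target.
print(desired_nodes(3, 90, 60, min_nodes=1, max_nodes=10))  # scale out
print(desired_nodes(3, 90, 60, min_nodes=1, max_nodes=4))   # capped at max
print(desired_nodes(4, 10, 60, min_nodes=2, max_nodes=10))  # held at min
```

Whatever rule computes the raw node count, the maximum boundary is what bounds infrastructure spend and the minimum is what keeps baseline capacity available.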

Connect to any data source or target using NiFi's rich processor library, including on-premises data sources, cloud data storage, cloud data warehouses, log data sources, cloud data analytics services, and cloud business process services. Developers can also quickly deploy ReadyFlows, a predefined set of data flows that implement the most common data flow use cases with minimal configuration.

Monitor all your NiFi flow deployments in a single dashboard, no matter which cloud they run on. Track important flow performance metrics by defining KPI alerts for your flow deployments.

Provision secure, stable, and scalable endpoints so that any application can easily send data to flow deployments.

Universal connectivity

Cloudera DataFlow offers universal connectivity to any system through purpose-built connectors for data streams, databases, data lakes & data warehouses, enterprise applications, and file systems as well as generic connectors leveraging industry standard protocols such as HTTP, Syslog, UDP, TCP, and more. 

FEATURED CONNECTORS

  • Apache Iceberg: data lakes & data warehouses
  • Apache Kafka: data streams
  • Azure Data Lake Storage: data lakes & data warehouses
  • Google BigQuery: data lakes & data warehouses
  • MongoDB: databases
  • Salesforce: enterprise applications
  • Snowflake: data lakes & data warehouses
  • Splunk: enterprise applications

DEVELOP SELF-SERVICE NIFI FLOWS & DEPLOY TO ANY CLOUD
as auto-scaling Kubernetes clusters or serverless NiFi flows

Runtime options in the public cloud

Cloud Runtime

  • DataFlow Deployments: NiFi clusters running on Kubernetes/containers
  • DataFlow Functions: NiFi flows running on cloud providers’ serverless compute services (AWS Lambda, Azure Functions, and Google Cloud Functions)

Use Case

  • DataFlow Deployments: Use cases that need low latency for high-throughput workloads requiring always-running NiFi flows
  • DataFlow Functions: Event-driven, micro-bursty use cases with no sub-second latency requirement, where NiFi flows do not need to run continuously

Benefits

  • DataFlow Deployments: Auto-scaling Kubernetes clusters for long-running workflows with centralized monitoring
  • DataFlow Functions: Efficient, cost-optimized, scalable way to run NiFi flows serverless, allowing developers to focus on business logic

Metering Unit

  • DataFlow Deployments: Cloudera Compute Unit (CCU)
  • DataFlow Functions: Method Invocation Count

Experience DataFlow for the Public Cloud for yourself

Collect data from the edge 


With Cloudera Edge Management, you can manage, control, and monitor the edge for streaming and IoT initiatives, and deliver real-time streaming data through no-code ingestion and management.

Get started

PRODUCT DOCUMENTATION

Find technical specs, architecture, and tutorials about Cloudera DataFlow for the Public Cloud.

Learn more

CDF FOR THE PUBLIC CLOUD PRICING


Evaluate Cloudera DataFlow for the Public Cloud pricing across public cloud instances.

Get the details

DATAFLOW OVERVIEW TOUR

Get a hands-on tour of Cloudera DataFlow for the Public Cloud.

Access now

CLOUDERA COMMUNITY WITH NIFI

Connect with your peers, ask questions, troubleshoot, and learn more about Apache NiFi.

Explore now

NIFI TRAINING

Book a three-day, hands-on training course on Apache NiFi fundamentals and more.

Go learn

PRODUCT DEMO

Watch a demo of Cloudera DataFlow for the Public Cloud as well as other CDP demos.

Go watch

Demo

DataFlow Functions on Cloudera Data Platform for Public Cloud

Webinar

Moving enterprise data from anywhere to any system made easy

News

Blog: Announcing GA of DataFlow Functions

Webinar

Tame all your streaming data pipelines

World-class training, support, & services
