Platform features

Editions compared: CDP Private Cloud Base Edition, Enterprise Data Hub, HDP Enterprise Plus

| Data management and analytics functions | Projects & components |
|---|---|
| Distributed batch processing of large data sets | Apache Hadoop |
| Database for structured data storage of large tables | Apache HBase +conn, +indx |
| Data warehouse summarization & ad hoc querying | Apache Hive |
| Metadata store for Hive tables | Hive Metastore (HMS) |
| Workflow scheduler to manage Hadoop jobs | Apache Oozie |
| Columnar storage format for Hadoop ecosystem | Apache Parquet |
| Fast compute engine for ETL, ML, stream processing | Apache Spark |
| Bulk data transfer between Hadoop and structured datastores | Apache Sqoop |
| Job scheduling and cluster resource management | YARN |
| Coordination service for distributed applications | Apache ZooKeeper |
| Store and manage large data sets across a cluster | Apache Accumulo |
| Metadata management, governance & data catalog | Apache Atlas |
| OLTP and real-time SQL access of large datasets | Apache Phoenix |
| Manage data security across the Hadoop ecosystem | Apache Ranger |
| Smallest, fastest columnar storage for Hadoop | Apache ORC |
| Data-flow framework for batch, interactive use cases | Apache Tez |
| Fast analytical queries on event-driven data | Apache Druid |
| Perimeter security governing access to Hadoop | Apache Knox |
| Easy interaction with Spark clusters via REST interface | Apache Livy |
| Cryptographic key management | Ranger KMS |
| Notebook for interactive analytics | Apache Zeppelin |
| Data serialization system | Apache Avro |
| Manage and control Hadoop ecosystem functions | Cloudera Manager |
| SQL workbench for data warehouses | Hue |
| Distributed MPP SQL query engine for Hadoop | Apache Impala |
| Cryptographic key management | Key Trustee Server |
| Column-oriented data store for fast data analytics | Apache Kudu |
| Enterprise search platform | Apache Solr |
| Key Trustee Server hardware security integration | Key HSM |
| Transparently encrypts and secures data at rest | Navigator Encrypt |
| Real-time streaming data pipelines and apps | Apache Kafka |
| Distributed object store for Hadoop | Apache Ozone |
| Streams Messaging for data ingestion and buffering | Apache Kafka |
| Monitoring and management of Kafka clusters | Streams Messaging Manager |
| Replication of cross-cluster Kafka data | Streams Replication Manager |
| Integrate with data sources from Kafka | Kafka Connect |
| Governance and management of metadata and schemas | Schema Registry |
| Auto-balancing of Kafka clusters | Cruise Control |
| Lightweight stream processing engine for Kafka | Kafka Streams |
| High-performance format for huge analytic tables | Apache Iceberg |
| Disaster Recovery & Backups | Iceberg Replication |
