Major Release Addresses the Increase of Streaming Data Ingest from Internet of Things with Enterprise-Ready Security and Management
PALO ALTO, Calif., February 18, 2016 — Cloudera, the global provider of the fastest, easiest, and most secure data management and analytics platform built on Apache Hadoop and the latest open source technologies, today announced the latest major release of Apache Kafka, a highly-scalable, fault-tolerant publish-subscribe messaging system built for real-time data streaming. Integrated into Cloudera Enterprise, this newest version of Kafka brings critical security features, advancements in multi-tenant operations, and a simplified development experience for big data pipelines, enabling users to more easily ingest and tap into the value of the growing volumes of data streaming from today’s world of connected devices.
McKinsey estimates that by 2020, close to 30 billion devices will be connected to the Internet of Things (IoT), and that 40% of the resulting value can be unlocked by enabling interoperability and combining data from multiple sources and IoT systems. Companies across every business sector are realizing the potential of this real-time, streaming data to drive new value and competitive advantages. Whether it’s the ability to access real-time data combined with historical data to provide insights that drive customer engagement, dramatically improve patient care, increase fraud detection, or power the next wave of digital enterprises, it’s hard to ignore the role streaming data plays in a modern enterprise. Kafka is designed for companies to quickly harness this data at any scale.
Cloudera customer Cisco WebEx, a leader in conferencing services, improved its customer ratings and uncovered up to 17 times more fraud when it moved from several data silos to a unified data discovery and analytics environment based on Cloudera Enterprise and Cisco UCS Servers. WebEx processes real-time streaming data via Apache Spark and shares the data with its services and fraud teams via Kafka, so they are alerted to any operational anomalies while conferences are underway and can act on or fix any issues immediately.
Additionally, Cerner has developed patient monitoring solutions that combine multiple healthcare data sources to detect dangerous blood infections that require immediate attention, ultimately saving hundreds of patients’ lives. Bidtellect ensures advertisers across the globe benefit from the same intelligence by using Kafka to stream data from multiple locations into Cloudera’s enterprise data hub. Finally, Cox Automotive has developed real-time dashboards to monitor core applications and IT metrics using Spark Streaming and Kafka.
“Secure, reliable pipelines for real-time data have never been more important. Our customers in every industry are facing a huge challenge: ingesting huge volumes of data from the growing wave of IoT-connected devices, especially as they’re looking to secure and manage this data as it streams into their enterprise data hub,” said Charles Zedlewski, vice president, Products at Cloudera. “Now that the latest version of Kafka is integrated directly into Cloudera’s platform, our customers can ensure their data pipelines meet the same stringent security requirements as the rest of their business. With added enterprise capabilities such as rolling restarts and industry-leading monitoring and troubleshooting, customers are able to focus on the value these new data sources and applications provide, not on manual administration of the underlying tools.”
To facilitate large-scale, real-time data ingest and egress within production Hadoop environments, Cloudera has incorporated the latest release of Kafka into its distribution to provide secure streaming, reliable multi-tenancy, and a simplified development experience. Combined with the functionality of Cloudera Enterprise 5.5, this version adds always-on availability and more robust monitoring and troubleshooting, along with connections to the leading third-party stream processing and data integration tools. Specific feature advancements include:
- Robust Security: End-to-end wire encryption protects data moving throughout the system and across data center boundaries, and Kerberos authentication prevents unauthorized access through a unified, standard identity management regime that spans the platform
- Reliable Multi-Tenancy: Throttle individual clients or tenants based on resource constraints to reliably scale and support growing data volumes and sources, without compromising other users
- Enterprise Management: Cloudera Manager provides easy deployment of Kafka as part of Cloudera Enterprise, with monitoring and customized alerting as part of the platform. With rolling restarts for Kafka, configurable replication policies, and the fastest time-to-resolution troubleshooting, Cloudera provides always-on availability for Kafka and the pipelines that depend on it.
- Simple, End-to-End Pipelines: A new, simpler Java API improves the developer experience for connecting Kafka to the rest of the big data ecosystem, including tools like Apache Flume and Spark.
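As a rough illustration of how the security features listed above surface to a developer, the sketch below assembles client settings for a connection that uses wire encryption and Kerberos authentication. The property keys are standard Apache Kafka client configuration names; the class name `SecureClientConfig`, the broker address, the truststore path and password, and the group id are hypothetical placeholders, and nothing here is specific to Cloudera's distribution.

```java
import java.util.Properties;

public class SecureClientConfig {

    // Builds settings for a Kafka client that connects over TLS
    // and authenticates with Kerberos (SASL/GSSAPI).
    public static Properties build(String bootstrapServers,
                                   String truststorePath,
                                   String truststorePassword) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        // SASL_SSL = Kerberos authentication over an encrypted channel.
        props.setProperty("security.protocol", "SASL_SSL");
        // Must match the service principal name the brokers run under
        // ("kafka" is an assumed value here).
        props.setProperty("sasl.kerberos.service.name", "kafka");
        // Truststore used to verify the brokers' certificates.
        props.setProperty("ssl.truststore.location", truststorePath);
        props.setProperty("ssl.truststore.password", truststorePassword);
        // Consumer settings; the group id is a placeholder.
        props.setProperty("group.id", "example-group");
        props.setProperty("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.setProperty("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        return props;
    }

    public static void main(String[] args) {
        Properties props = build("broker1.example.com:9093",
                                 "/path/to/truststore.jks",
                                 "changeit");
        // Print the settings in a stable order, leaving out secrets.
        props.stringPropertyNames().stream()
             .sorted()
             .filter(key -> !key.contains("password"))
             .forEach(key -> System.out.println(key + "=" + props.getProperty(key)));
    }
}
```

In a real application these properties would be passed to the `KafkaConsumer` constructor from the `kafka-clients` library, with the Kerberos credentials supplied separately through a JAAS configuration.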
By taking advantage of Kafka as part of a Cloudera Enterprise subscription, data engineers can stream, process, and serve data in real time, all within a single, unified platform. With access to Cloudera’s largest partner ecosystem and a vigorous third-party certification program, these users can extend the capabilities of the platform with trusted, direct integrations with the leading data integration and enrichment tools, such as Pentaho, StreamSets, Syncsort, and Talend.
“The direct integration of this new release of Kafka into Cloudera Enterprise is great news for the Kafka community as it provides a level of trust that Kafka is supported by a leading Hadoop distribution provider and part of a production-ready modern data platform,” said Eddie White, EVP of business development at Pentaho, a Hitachi Group Company. “This also benefits Pentaho as a strategic partner as we can bring Kafka into Pentaho Labs through Cloudera's platform to further validate the technology with our data integration and analytics platform to help companies accelerate their IoT investments, now and in the future.”
"StreamSets is excited by the continued development of Apache Kafka as an integral component of Cloudera Enterprise,” said Arvind Prabhakar, chief technology officer, StreamSets. “StreamSets Data Collector, which installs as a Cloudera Manager parcel, combines visual pipeline design and intelligent data monitoring capabilities with Kafka. It enables Cloudera customers to deploy data flows across the platform with unprecedented development ease and operational visibility.”
“Our customers in financial services, healthcare, retail and telecommunications are looking to take advantage of the speed and resiliency of Apache Kafka for low-latency and fault-tolerant data services for an increasing number of use cases such as fraud detection, analytics on telemetry and security data,” said Tendü Yoğurtçu, General Manager of Syncsort’s Big Data business. “Cloudera is supporting this need with timely delivery of the new version of Kafka in Cloudera Enterprise, securing real-time data pipelines. Syncsort’s integration with Kafka helps organizations use a single, easy-to-use software environment to create a data pipeline for diverse enterprise sources, including batch, streaming, mainframe and IoT data.”
"Apache Kafka is rapidly becoming the key messaging protocol for real-time big data scenarios. With this new release from Cloudera, our joint customers can build intelligent applications for smart cities, real-time recommendations, predictive maintenance, precision gaming, and much more,” said Ashley Stirrup, CMO, Talend. “By combining Kafka for ingestion, Apache Spark for data processing and machine learning, customers can leverage ‘in-the-moment’ insights like never before.”
Kafka has risen to become the go-to open source data ingest tool for collecting and transporting high-velocity streams of data from Internet of Things devices and sensors. With the help of Cloudera’s ecosystem of more than 2,000 partners and industry-leading enterprise technologies, Kafka is now better positioned to help businesses take advantage of IoT opportunities at scale.
Cloudera delivers the modern data management and analytics platform built on Apache Hadoop and the latest open source technologies. The world’s leading organizations trust Cloudera to help solve their most challenging business problems with Cloudera Enterprise, the fastest, easiest and most secure data platform available for the modern world. Our customers efficiently capture, store, process and analyze vast amounts of data, empowering them to use advanced analytics to drive business decisions quickly, flexibly and at lower cost than has been possible before. To ensure our customers are successful, we offer comprehensive support, training and professional services. Learn more at cloudera.com.
Cloudera, Cloudera's Platform for Big Data, Cloudera Enterprise Data Hub Edition, Cloudera Enterprise Flex Edition, Cloudera Enterprise Basic Edition, Cloudera Navigator Optimizer and CDH are trademarks or registered trademarks of Cloudera Inc. in the United States, and in jurisdictions throughout the world. All other company and product names may be trademarks of their respective owners.