PALO ALTO, Calif. – January 31, 2017 – Cloudera, the global provider of the fastest, easiest, and most secure data management, analytics and machine learning platform built on the latest open source technologies, today announced that Apache Kudu, the open source software (OSS) storage engine for fast analytics on fast moving data, is now shipping as a generally available component within Cloudera Enterprise 5.10. Kudu simplifies the path to real-time analytics, allowing users to act quickly on data as it happens to make better business decisions.
“Real-time data analysis has been a challenge for enterprises because it required a complex lambda architecture to merge together real-time stream processing and batch analytics. Kudu dramatically eases that architecture with a single storage engine that addresses both needs,” said Charles Zedlewski, senior vice president of products at Cloudera. “The high-demand workloads in place today, which include a growing number of new machine-learning models, can identify cybersecurity threats, predict maintenance issues in the Industrial Internet of Things (IIoT), and bring much more accuracy to all types of online reporting.”
Kudu was designed from the ground up to take advantage of innovation in the hardware landscape, which has seen solid-state storage and RAM become more affordable. As a standalone storage engine, Kudu has already proven itself for mission-critical production use in clusters with hundreds of nodes handling many millions of inserts per second. Kudu is purpose-built to enable use cases that require fast, large-scale analytic scans while supporting rapidly updating data - necessary for handling time series data, machine data analytics, online reporting, or other analytic or operational workload needs.
“Apache Kudu is a prime example of how the Apache Hadoop® platform is evolving from a sharply defined set of Apache projects to a mixing and matching of open source and proprietary technologies that form, in essence, a big data operating environment,” said Tony Baer, principal analyst at Ovum. “Kudu bypasses the hurdles associated with complex lambda architectures to address use cases involving fast-changing data, where the ability to rapidly modify and update the database are critical.”
Beta programs for select Cloudera customers, directly and through partners, have driven Kudu into critical production environments. Further adoption is anticipated among Cloudera’s customer base to address the ever-increasing number of use cases that require real-time analytics.
“Achieving compliance and operational reporting alongside analytical success requires both the ability to process large amounts of data to find trends, and to detect and respond to anomalies quickly,” said Michael Reed, director of enterprise information management at Meridian Health. “We’re excited about the potential of Kudu to allow us to do analytical and real-time operations in a single place to help us to simplify the systems that we build.”
In addition to Kudu, Cloudera 5.10 (and the release of Cloudera Director 2.3) continues to enhance enterprise-grade capabilities for cloud deployments and improve cost-efficiencies in these environments. New capabilities include:
- Reduced operating costs for batch processing on transient workloads with improved performance of Apache Hive on Amazon S3
- More comprehensive auditing and lineage in the cloud with single-cluster Cloudera Navigator support for Amazon S3
- Reduced time to deploy initial use case with faster first run deployments across cloud environments
In September of 2015, Cloudera announced the public beta release of Apache Kudu, and two months later, Cloudera donated Kudu to the Apache Software Foundation (ASF) to open it to the broader development community - garnering contributions from engineers at Xiaomi, Intel, and others. Kudu is now generally available and shipping as a standard component of Cloudera Enterprise, giving customers a robust set of storage engines - NoSQL, HDFS, object store, and relational - to meet the specific needs of their use case.
Agil Data (SI)
“Apache Kudu represents a major advance in the field of open source database technology, enabling real-time data analytics in ways that were previously very challenging to implement. We see a wide array of use cases for Kudu, particularly in the InsurTech sector, and expect it to have a positive impact on many of our clients and projects in the coming years.”
-- Cory Isaacson, Executive Chairman
“We’re thrilled to integrate Apache Kudu with Arcadia Enterprise as it provides a real-time and responsive storage engine for data-centric business applications. As an application developer it's great to have a clean API that we can use through Apache Impala, Apache Spark or directly within Arcadia Enterprise. With Kudu, we have finally come to a point where Hadoop goes beyond ingesting and analyzing data to become the de facto place where Arcadia can generate and update data without the need for any data movement.”
-- Shant Hovsepian, co-founder and chief technology officer
Avalon Consulting, LLC
“The main goal of a recent credit card processing project at Avalon was to re-architect our client’s traditional batch-oriented processing system to improve agility and add real-time fraud detection alerts alongside near real-time executive dashboards. Using Kudu and Impala, we were able to meet the sub-two-second response time required for queries from the new system. Achieving this with the existing EDW would have been cost prohibitive compared to leveraging Cloudera, and Kudu helped us meet our most critical latency requirement.”
-- Tom Reidy, chief executive officer
“As Capgemini’s clients are maturing in their usage of big data, analytics and data science, the need for more real-time analytics workloads on big and fast data, including IoT and streaming data, has become a more and more central topic. The GA of Kudu is a great step towards allowing our clients to build even more critical, insights-centric processes on top of their business analytics platform, which will accelerate their Digital Transformation journey.”
RCG Global Services
“RCG Global Services sees Kudu as an important advance in data storage for big data. It provides low-latency and high-performance for reading and writing data on Cloudera clusters to meet the needs of demanding applications, including real-time analytics. At RCG Global Services, we have incorporated Kudu into all of our Cloudera certified RCG|enable™ industry solutions for banking, healthcare, hospitality, insurance, and retail to take advantage of these features.”
-- Rick Skriletz, Global Managing Principal
“Apache Kudu is outstanding advanced columnar storage, which SoftServe trusts to keep data centralized, accessible, and secure. As part of a recent payment processing project involving a ‘Big Four’ professional services firm, SoftServe developers chose Kudu for its ability to support massive data sets while providing transactional consistency. Kudu was the only tool that had both characteristics and passed comprehensive performance and reliability tests, enabling us to deliver an innovative solution that supported our client’s business goals.”
-- Todd Lenox, VP, Digital Partnerships
“Incorporating Apache Kudu into CDH will greatly simplify execution of the mixed workloads our customers increasingly utilize once they migrate their enterprise data warehouse and real-time streams to Hadoop. The Cloudera-certified StreamSets Data Collector natively supports Kudu as a plug-and-play dataflow destination, and StreamSets Dataflow Performance Manager helps assure the continuous availability and accuracy of the data flowing into Kudu.”
-- Arvind Prabhakar, chief technology officer
“Kudu provides us a quantum leap in our client engagements requiring a fast data services layer that can effortlessly handle the high velocity of data in modern digital systems. For example, in the modern high performance IoT system we are designing for our customers, the combination of Apache Spark with Apache Kudu is essential to meet system requirements. A key added benefit for us is that there is no need to retrain our developers who are already skilled in the Apache Hadoop HDFS technology stack in order to effectively use Apache Kudu. We are glad to note the GA of Apache Kudu and expect to use it widely in our client engagements.”
-- Dr. Satya, vice president, TCS Digital Business
“The GA of Kudu within Cloudera Enterprise is an important milestone on the path to streaming analytics. Zoomdata saw the value of Kudu early in its development and worked with the Cloudera engineering team to develop a set of big data analytic capabilities that leverage Kudu. We’re now in a position to deliver even more value to Cloudera and Zoomdata’s joint customers through the ability to run visual analytic queries in real time.”
-- Ruhollah Farchtchi, chief technology officer
Additional Resources for Apache Kudu
Additional Resources for Cloudera 5.10
● Learn more on the Cloudera Engineering Blog.
Cloudera delivers the modern data management and analytics platform built on Apache Hadoop and the latest open source technologies. The world’s leading organizations trust Cloudera to help solve their most challenging business problems with Cloudera Enterprise, the fastest, easiest and most secure data platform available for the modern world. Our customers efficiently capture, store, process and analyze vast amounts of data, empowering them to use advanced analytics to drive business decisions quickly, flexibly and at lower cost than has been possible before. To ensure our customers are successful, we offer comprehensive support, training and professional services. Learn more at cloudera.com.
Connect with Cloudera
Read our blogs: cloudera.com/engblog and vision.cloudera.com
Follow us on Twitter: twitter.com/cloudera
Visit us on Facebook: facebook.com/cloudera
Join the Cloudera Community: community.cloudera.com
Cloudera, Cloudera's Platform for Big Data, Cloudera Enterprise Data Hub Edition, Cloudera Enterprise Flex Edition, Cloudera Enterprise Basic Edition, Cloudera Navigator Optimizer and CDH are trademarks or registered trademarks of Cloudera Inc. in the United States, and in jurisdictions throughout the world. All other company and product names may be trademarks of their respective owners.