The market for next generation data management software is driven primarily by:
- The explosion of data: the amount of data, the variety of data types (structured, unstructured and semi-structured) and the speed at which data must be stored, processed and analyzed
- The trend toward data-centricity and the information-driven enterprise, resulting in widespread upgrading and re-architecting of data centers
- The emergence of the Internet of Things (IoT)
Hadoop technology is particularly well-suited to address these trends and has quickly become a foundational element in modern enterprise data infrastructure. IDC projects that the Worldwide Big Data Technology market, which includes hardware, software and services, will be $32 billion in 2017.
Similarly, the data-related enterprise software market (data management, data storage and business intelligence) is forecast to be approximately $110 billion in 2018 [1]. We have estimated that more than $30 billion of this amount relates to analytical workloads and operational data stores, as contrasted with transactional workloads. Analytical workloads are the most immediately addressable by an EDH platform and Hadoop technology, and they are also one of the fastest growing segments of the data-related enterprise software market [2].
[1] Gartner, Enterprise Software Markets, Worldwide 2011-2018, 2Q 2014 Update
[2] 451 Research, Total Data Warehousing 2013-2018, July 2014
Hadoop was co-created in 2006 by Doug Cutting, who has been Cloudera’s Chief Architect since 2009. The idea for the open source project came from Doug’s reading of the seminal papers by Google on MapReduce and the Google File System. Hadoop is a software framework for storage and distributed parallel processing of large data sets on clusters of commodity hardware. It scales out as nodes are added to a cluster and creates a single location where any enterprise data can be stored, processed, analyzed and made accessible to other systems and applications. As open source software, it is built and used by a global community of contributors and users.
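To make the MapReduce processing model mentioned above more concrete, here is a minimal single-process sketch in Python. It is an illustration only and does not use Hadoop or any Hadoop API. The function names (map_phase, shuffle_phase, reduce_phase, word_count) are hypothetical and chosen for this example. On a real cluster the map and reduce steps would run in parallel on many machines over data held in a distributed file system.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Group emitted values by key, the step a framework performs between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values collected for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce in sequence inside one process (illustrative only)."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data big clusters", "data hub"]))
```

Because each map call looks at only one record and each reduce call looks at only one key, the work can in principle be split across as many machines as are available, which is the property the model relies on.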
By 2008, many of the world’s leading technology companies, including Yahoo!, Facebook, LinkedIn and eBay, had adopted Hadoop as the underpinning of their data strategy, but few people recognized the opportunity and potential for Hadoop in the enterprise. Cloudera’s founders came together with a shared vision to make Hadoop reliable and deliver the capabilities enterprise customers require. In 2009, Cloudera introduced the first version of its Hadoop distribution, known as CDH (Cloudera’s distribution including Apache Hadoop). Today, we are shipping the fifth generation of CDH, comprising more than 25 distinct open source projects. Cloudera is the only vendor that delivers a data management solution built on Hadoop that includes comprehensive security, robust system management and unified data management.
Cloudera’s Enterprise Data Hub (EDH)
In the face of a seeming avalanche of data, enterprises are seeking to draw insights from this information across their businesses but are increasingly concerned with its management, security and governance. While Hadoop is extremely economical in any context, the open source technology by itself is not a complete data management solution for many enterprise customers.
To enable organizations to realize the full potential of Hadoop in a manner consistent with enterprise requirements, in 2013 Cloudera introduced the first enterprise data hub, a ‘reference architecture’ that defines a new data management platform built around Hadoop. The power of an EDH stems from its integration with existing systems such as databases, enterprise data warehouses (EDWs), data integration tools and analytic applications. In this regard, an EDH complements existing systems, allowing enterprises to optimize where workloads occur and to offer access to, and analysis of, all of an enterprise’s data as an internal service.
Cloudera’s Hybrid Open Source Software (HOSS) Business Model
Cloudera is a pioneer in bringing the hybrid open source software business model to enterprise infrastructure software. In addition to being the most significant contributor to the development of the Hadoop open source software ecosystem, Cloudera has developed differentiated software such as Cloudera Manager (system management), Cloudera Navigator (data management, lineage, governance and security), Cloudera Director (cloud-based deployment) and other capabilities that augment our open source core for enterprise use. Hive, HBase, Sqoop, Impala, Sentry, Flume and Hue are some of the projects founded by Cloudera that were then contributed to the open source community.
Cloudera is committed to partnering with the open source community and the Apache Software Foundation to continue the advancement of Hadoop, lead the establishment of open standards and enable enterprises to fully capitalize on all their data.