Frequently Asked Questions (FAQs)
What is Cloudera?
Cloudera is revolutionizing enterprise data management by offering the first unified Platform for Big Data: The Enterprise Data Hub. Cloudera offers enterprises one place to store, process, and analyze all their data, empowering them to extend the value of existing investments while enabling fundamental new ways to derive value from their data.
Founded in 2008, Cloudera was the first provider and supporter of Apache Hadoop for the enterprise, and it is currently the leading one. Cloudera also offers software for business-critical data challenges including storage, access, management, analysis, security, and search.
Customer success is Cloudera's highest priority. We’ve enabled long-term, successful deployments for hundreds of customers, with petabytes of data collectively under management, across diverse industries.
Learn more about Cloudera here.
What is big data?
Generally speaking, the term “Big Data” refers to any data that for whatever reason (not just volume) cannot be affordably managed by your traditional systems. Big Data is a relative concept and is highly contextual to the environment. For example, even if your organization doesn’t accumulate data on a Facebook-like scale, or even if it primarily collects just one type of data, it may well have Big Data challenges as well as opportunities.
Big Data presents a tremendous opportunity for enterprises across industries. By tapping into new volumes and varieties of data, organizations can ask questions about their customers and their business like never before. For example, organizations are using data to deliver a better customer experience, thus resulting in a more loyal customer base from which they can derive greater value. At the same time, with improved insight into business operations, it’s possible to identify areas of inefficiency that, if addressed, can help reduce operating costs.
Learn more about big data here.
Why do customers choose Cloudera?
Cloudera was the first commercial provider of Hadoop-related software and services and has the most customers with enterprise requirements, and the most experience supporting them, in the industry. Cloudera’s combined offering of differentiated software (open and closed source), support, training, professional services, and indemnity brings customers the greatest business value, in the shortest amount of time, at the lowest TCO.
Learn more about why customers choose Cloudera here.
What is an enterprise data hub?
An enterprise data hub is one place to store all your data, for as long as desired or required, in its original fidelity; integrated with existing infrastructure and tools; with the flexibility to run a variety of enterprise workloads -- including batch processing, interactive SQL, enterprise search, and advanced analytics -- together with the robust security, governance, data protection, and management that enterprises require. With an enterprise data hub, leading organizations are changing the way they think about data, transforming it from a cost into an asset.
Learn more about enterprise data hubs here.
What is Hadoop, and what is its role in an enterprise data hub?
The Hadoop project, which Doug Cutting (now Cloudera's Chief Architect) co-founded in 2006, is an effort to create open source implementations of internal systems used by Web-scale companies such as Google, Yahoo!, and Facebook to manage and process massive data volumes. Hadoop, combined with related ecosystem projects, enables distributed, parallel processing of huge amounts of data across industry-standard servers (with storage and processing occurring on the same machines), and it can scale indefinitely.
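To make "distributed, parallel processing" a little more concrete, here is a minimal single-machine sketch of the map/shuffle/reduce pattern that Hadoop's MapReduce programming model is built around. It is not Hadoop's API: the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical, and a small thread pool stands in for the many servers that would each process the data blocks they store locally.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_phase(block):
    """Emit a (word, 1) pair for every word in one block of input text."""
    return [(word.lower(), 1) for word in block.split()]


def shuffle(mapped_blocks):
    """Group the emitted values by key, as a framework would between phases."""
    grouped = defaultdict(list)
    for pairs in mapped_blocks:
        for key, value in pairs:
            grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Combine the values collected for each key into a single count."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(blocks, workers=2):
    """Run the map step over all blocks in parallel, then shuffle and reduce."""
    # Assumption: worker threads play the role of cluster nodes here.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(map_phase, blocks))
    return reduce_phase(shuffle(mapped))


if __name__ == "__main__":
    blocks = ["big data big value", "data hub", "big hub"]
    print(word_count(blocks))
```

Because each block is mapped independently, adding more blocks only requires adding more workers, which is the property that lets the real system spread work across many industry-standard servers.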
Hadoop has evolved into a stable, scalable, flexible core for next-generation data management -- yet on its own, it lacks some critical capabilities when deployed as the center of an enterprise data hub. For example, it lacks a comprehensive security model across the entire ecosystem of projects. Hadoop was also built for batch-mode data processing workloads, which limits it to an ancillary position in the data center. (A central enterprise data hub, by contrast, must have real-time capability.) And Hadoop doesn’t support the range of industry-standard interfaces, for query and search applications among others, that business users require. Cloudera has addressed all these challenges and more with its Cloudera Enterprise Data Hub Edition product.
Learn more about Hadoop here.
What are some common use cases for an enterprise data hub?
A Hadoop-based enterprise data hub allows you to process and access more data than ever before, so it has many near-term (operational) as well as long-term (strategic) use cases across multiple industries. Generally, enterprise data hub use cases fall into these broad categories:
- Transformation and enrichment: Transform and process large amounts of data more quickly, reliably, and affordably (for loading into the data warehouse, for example).
- Active archive: Get access to data that would otherwise be taken offline (typically to tape) due to the high cost of actively managing it.
- Self-service exploratory BI: Allow users to explore data, with full security, using traditional interactive business intelligence tools via SQL and keyword search.
- Advanced analytics: Rather than making them examine samples of data, or snapshots from short time periods, let users combine all historical data, in its full fidelity, for comprehensive analyses.
Learn more about real-world applications for an enterprise data hub here.
What are Cloudera's products?
Cloudera’s platform, which is designed specifically to address customer opportunities and challenges in Big Data, is available either as free, unsupported products (CDH or Cloudera Express, for those interested solely in a free Hadoop distribution) or as supported, enterprise-class software sold by annual subscription (Cloudera Enterprise, in Basic, Flex, and Data Hub editions). All the integration work is done for you, and the entire solution is thoroughly tested for enterprise requirements and fully documented.
Learn more about Cloudera products here.
Why do I need a Cloudera Enterprise subscription?
A Cloudera Enterprise subscription, which includes access to differentiated system and data management software, 8x5 or 24x7 support, and indemnity, is an essential ingredient in any sustainable deployment of an enterprise data hub. Furthermore, all Cloudera subscriptions are up for renewal annually, so Cloudera must continually re-prove its value to you.
Learn more about Cloudera support here.
What makes Cloudera's products unique?
Cloudera’s platform has several differentiating attributes that make it unique, including:
- Differences from commercial alternatives: Cloudera offers differentiating capabilities such as production-grade interactive SQL and Search on Hadoop; comprehensive system management with rolling upgrades, automated disaster recovery, centralized security, proactive health checks, and multi-cluster management; and simplified data management with granular auditing and access control capabilities.
- Differences from stock Apache Hadoop: Although Cloudera's platform contains the same code that can be found in the “upstream” Hadoop ecosystem projects, Cloudera ships new bug fixes and stable features to users of its platform on a regular (quarterly) basis, and contributes them to the upstream code base as well. Thus, Cloudera customers get predictable and regular access to platform improvements, along with the assurances of rigorous testing and upstream compatibility.
Learn more about Cloudera products here.
Do Cloudera’s products work with my existing data management infrastructure?
The Cloudera Connect Partner Program, more than 700 companies strong, is designed to champion partner advancement and solution development for the Big Data ecosystem. As the Hadoop vendor with the most partners and the only Hadoop provider with a technology certification program, Cloudera ensures consistency, reliability, and tight integration with enterprise environments.
Learn more about Cloudera’s partners here.
Why does open source matter for customers?
Open source licensing and development offer customers powerful benefits, including freedom from lock-in, free no-obligation evaluation, rapid innovation on a global scale, and community-driven development. Freedom from lock-in is particularly important where components that store and process data are involved.
Learn more about why open source matters here.
Is Cloudera's platform open source?
The core of Cloudera’s platform, CDH, is open source (Apache License), so users always have the option to move their data to an alternative -- and thus Cloudera must continually earn your business based on merit. In fact, Cloudera is an open source leader in Big Data, with its employees collectively contributing more code to the Hadoop ecosystem than those of any other company.
Cloudera complements this open core with closed source management software that provides key enterprise functionality requested by customers, such as support for rolling upgrades, auditing management, and disaster recovery. That software, however, does not store or process data, and thus lock-in is not an issue.
What does Cloudera’s open source leadership mean for customers?
Open source benefits, such as freedom from lock-in, are tangible and time-tested. That said, they are just table stakes when deploying an enterprise data hub based on open source software such as Hadoop.
Cloudera also leads the way in ensuring that customer needs for performance, availability, security, and recoverability are met by new features in the Apache code base, and then in shipping and supporting those features for customers in our platform. To make that goal possible, Cloudera employs more ecosystem committers, establishes more successful new ecosystem projects, and contributes more code to that ecosystem than any other vendor.