Explore and analyze with greater flexibility
Cloudera offers the only modern analytic database that leverages the flexibility and cost-effective elasticity of the cloud for the fastest time to insights. Avoid rigid data modeling and time-consuming data loading by directly querying data in cloud-native storage like Amazon S3. Dynamically deploy and elastically resize to meet your needs on demand. Cloudera also provides better price-performance than other cloud-based analytic databases.
Cloudera Enterprise is Driving Big Data Analytics to the Cloud
Directly query cloud-native storage
Run high-performance SQL analytics directly against data in Amazon S3 without having to move the data into separate storage or transform it into a proprietary format.
Elastically scale for changing demands
Support peak and off-peak usage with a modern, decoupled architecture that allows you to elastically grow and shrink compute and storage independently, as needed.
Transient pay-as-you-go clusters
Spin up and spin down clusters for periodic batch jobs to address changing needs and save on compute hosting costs without time-consuming data-load operations.
Data portability and flexibility
A shared object store means data is not bound to a single application. The same data can be used by multiple applications, including ETL and data science, without moving it to a separate cluster.
Cloudera Enterprise delivers a modern analytic database, powered by Apache Impala (incubating), for high-performance SQL analytics in the cloud. Impala is the only MPP SQL engine with hybrid portability, built to work with data stored on open, shared data platforms like Apache Hadoop’s HDFS (on local Amazon EBS storage), Apache Kudu’s columnar storage, and object stores like Amazon S3. With the ability to access data from multiple sources in open formats, Impala lets users query data more quickly and flexibly, without the rigid data modeling and loading typical of monolithic analytic database architectures. This capability is especially useful in the cloud, where you can take advantage of transient clusters and on-demand elasticity to save on cluster-hosting costs.
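To make the idea of querying data in place more concrete, here is a minimal sketch in Python that builds the kind of Impala statement that maps a table onto files already sitting in Amazon S3. The helper name `external_table_ddl`, the table name, the columns, and the bucket path are all hypothetical and chosen only for illustration. The sketch assumes the files are stored as Parquet and that Impala addresses S3 through `s3a://` URIs.

```python
def external_table_ddl(table, columns, s3_location, file_format="PARQUET"):
    """Build an Impala CREATE EXTERNAL TABLE statement over existing S3 data.

    The table is EXTERNAL, so the files stay where they are in S3 and are
    not copied or converted into another storage format.
    """
    if not s3_location.startswith("s3a://"):
        raise ValueError("expected an s3a:// location for S3-backed tables")
    column_list = ", ".join(f"{name} {sql_type}" for name, sql_type in columns)
    return (
        f"CREATE EXTERNAL TABLE {table} ({column_list}) "
        f"STORED AS {file_format} "
        f"LOCATION '{s3_location}'"
    )


# Hypothetical table, columns, and bucket, for illustration only.
ddl = external_table_ddl(
    "web_logs",
    [("event_time", "TIMESTAMP"), ("url", "STRING"), ("status", "INT")],
    "s3a://example-bucket/logs/",
)
print(ddl)
```

Once a statement like this has been run in Impala, ordinary SQL queries against `web_logs` read the S3 files directly, with no separate load step.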
Paired with Hive-on-Spark, a powerful batch processing tool that also supports cloud-native data access, Impala lets you take full advantage of a shared data layer to support ETL and BI analytics in the cloud. Using Cloudera Director, you can spin up transient clusters for periodic batch and data preparation jobs and terminate them once the jobs are complete. The resulting data is immediately available for BI and exploration, and Cloudera Director makes it easy to elastically grow and shrink these clusters based on peak usage.
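The cost argument for transient clusters comes down to simple arithmetic, sketched below in Python. Every number here (node count, hourly rate, job length) is a hypothetical placeholder rather than a Cloudera or AWS price, and the model deliberately ignores storage, data transfer, and cluster start-up time.

```python
HOURS_PER_MONTH = 730  # average number of hours in a month


def cluster_cost(nodes, hourly_rate_per_node, hours):
    """Compute cost of running a cluster of `nodes` machines for `hours`."""
    return nodes * hourly_rate_per_node * hours


# Hypothetical figures: 10 nodes at $0.50 per node-hour.
always_on = cluster_cost(10, 0.50, HOURS_PER_MONTH)

# The same cluster started for a 2-hour batch job on each of 30 nights,
# then terminated when the job finishes.
transient = cluster_cost(10, 0.50, 2 * 30)

savings = always_on - transient
print(f"always-on: ${always_on:.2f}, transient: ${transient:.2f}, "
      f"savings: ${savings:.2f}")
```

Under these assumed figures the always-on cluster costs $3650.00 a month and the transient one $300.00. Actual savings depend on real instance pricing and on how long each job runs.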