Cloudera Impala
Open Source, Real-time Query for Hadoop
Cloudera Impala is an open source Massively Parallel Processing (MPP) query engine that runs natively on Apache Hadoop. The Apache-licensed Impala project brings scalable parallel database technology to Hadoop, enabling users to issue low-latency SQL queries to data stored in HDFS and Apache HBase without requiring data movement or transformation. Impala is integrated from the ground up as part of the Hadoop ecosystem and leverages the same flexible file and data formats, metadata, security and resource management frameworks used by MapReduce, Apache Hive, Apache Pig and other components of the Hadoop stack.
With Impala, analysts and data scientists can perform real-time, “speed of thought” analytics on data stored in Hadoop via SQL or through Business Intelligence (BI) tools. As a result, large-scale data processing (via MapReduce) and interactive queries can run on the same system using the same data and metadata, removing the need to migrate data sets into specialized systems or proprietary formats simply to perform analysis.
Key Benefits of Impala
Speed to Insight
Perform interactive analytics directly on data stored in Hadoop. Get answers as quickly as you can ask questions, without the bottlenecks caused by data movement and jumping between data silos.
Cost Savings
Reduce data movement and duplicate storage in specialized systems by performing interactive analysis directly on full-fidelity data.
Full-Fidelity Analysis
Ask questions of all your data, without the loss of fidelity that comes from aggregating it or conforming it to fixed schemas.
Familiarity
Leverage existing BI tools and employee skill sets (SQL) to interact with data stored in Hadoop.
Discoverability
Enable more users to interact with more data by providing a single repository and metadata store from source to analysis.
Unification
Leverage the same file and data formats, metadata, security and resource management frameworks you use for the rest of the Hadoop stack.