This is the documentation for CDH 5.0.x. Documentation for other versions is available at Cloudera Documentation.

New Features in Impala

New Features in Impala Version 1.3.3

No new features. This point release is exclusively a bug fix release for an SSL security issue.

  Note: Impala 1.3.3 is available as part of CDH 5.0.5, and under CDH 4.

New Features in Impala Version 1.3.2

No new features. This point release is exclusively a bug fix release for the IMPALA-1019 issue related to HDFS caching.

  Note: Impala 1.3.2 is only available as part of CDH 5.0.4, not under CDH 4.

New Features in Impala Version 1.3.1

This point release is primarily a vehicle to deliver bug fixes. Any new features are minor changes resulting from fixes for performance, reliability, or usability issues.

Because 1.3.1 is the first 1.3.x release for CDH 4, if you are on CDH 4, also consult New Features in Impala Version 1.3.0 for more features that are new to you.

  Note: The Impala 1.3.1 release is available for both CDH 4 and CDH 5. This is the first release in the 1.3.x series for CDH 4.

  • A new impalad startup option, --insert_inherit_permissions, causes Impala INSERT statements to create each new partition with the same HDFS permissions as its parent directory. By default, INSERT statements create directories for new partitions using default HDFS permissions. See INSERT Statement for examples of INSERT statements for partitioned tables.

  • The SHOW FUNCTIONS statement now displays the return type of each function, in addition to the types of its arguments. See SHOW Statement for examples.

  • You can now specify the clause FIELDS TERMINATED BY '\0' with a CREATE TABLE statement to use text data files that use ASCII 0 (nul) characters as a delimiter. See Using Text Data Files with Impala Tables for details.

  • In Impala 1.3.1 and higher, the REGEXP and RLIKE operators match a regular expression that occurs anywhere inside the target string, the same as if the regular expression were enclosed on each side by .*. Previously, these operators only succeeded when the regular expression matched the entire target string. This change improves compatibility with the regular expression support in popular database systems. There is no change to the behavior of the regexp_extract() and regexp_replace() built-in functions. See REGEXP Operator for examples.
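
The change to REGEXP and RLIKE matching can be illustrated outside Impala. The following Python sketch is an analogy only: it uses Python's re module rather than Impala's regular expression engine, with re.fullmatch standing in for the earlier whole-string behavior and re.search for the match-anywhere behavior. The function names and sample strings are hypothetical.

```python
import re


def regexp_before_1_3_1(target: str, pattern: str) -> bool:
    """Earlier behavior: the pattern must match the entire target string."""
    return re.fullmatch(pattern, target) is not None


def regexp_since_1_3_1(target: str, pattern: str) -> bool:
    """Newer behavior: the pattern may match anywhere inside the target string."""
    return re.search(pattern, target) is not None


if __name__ == "__main__":
    target = "abc123xyz"
    print(regexp_before_1_3_1(target, "[0-9]+"))      # False: digits are only a substring
    print(regexp_since_1_3_1(target, "[0-9]+"))       # True: substring match is enough
    print(regexp_before_1_3_1(target, ".*[0-9]+.*"))  # True: explicit .* on each side
```

In this sketch, a pattern anchored with ^ and $ still has to cover the whole string under the match-anywhere behavior.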

New Features in Impala Version 1.3.0

  Note: The Impala 1.3.1 release is available for both CDH 4 and CDH 5. This is the first release in the 1.3.x series for CDH 4.

  • The admission control feature lets you control and prioritize the volume and resource consumption of concurrent queries. This mechanism reduces spikes in resource usage, helping Impala to run alongside other kinds of workloads on a busy cluster. It also provides more user-friendly conflict resolution when multiple memory-intensive queries are submitted concurrently, avoiding resource contention that formerly resulted in out-of-memory errors. See Admission Control and Query Queuing for details.

  • Enhanced EXPLAIN plans provide more detail in an easier-to-read format. Now there are four levels of verbosity: the EXPLAIN_LEVEL option can be set from 0 (most concise) to 3 (most verbose). See EXPLAIN Statement for syntax and Understanding Impala Query Performance - EXPLAIN Plans and Query Profiles for usage information.

  • The TIMESTAMP data type accepts more kinds of input string formats through the UNIX_TIMESTAMP function, and produces more varieties of string formats through the FROM_UNIXTIME function. The documentation now also lists more functions for date arithmetic, used for adding and subtracting INTERVAL expressions from TIMESTAMP values. See Date and Time Functions for details.

  • New conditional functions, NULLIF(), NULLIFZERO(), and ZEROIFNULL(), simplify porting SQL containing vendor extensions to Impala. See Conditional Functions for details.

  • New utility function, CURRENT_DATABASE(). See Miscellaneous Functions for details.

  • Integration with the YARN resource management framework. Only available in combination with CDH 5. This feature makes use of the underlying YARN service, plus an additional service (Llama) that coordinates requests to YARN for Impala resources, so that the Impala query only proceeds when all requested resources are available. See Using YARN Resource Management with Impala (CDH 5 Only) for full details.

      Warning: In CDH 5.0.0, the Llama component is in beta. It is intended for evaluation of resource management in test environments, in combination with Impala and YARN. It is currently not recommended for production deployment.

    On the Impala side, this feature involves some new startup options for the impalad daemon:

    • -enable_rm
    • -llama_host
    • -llama_port
    • -llama_callback_port
    • -cgroup_hierarchy_path

    For details of these startup options, see Modifying Impala Startup Options.

    This feature also involves several new or changed query options that you can set through the impala-shell interpreter and apply within a specific session:

    • MEM_LIMIT: the function of this existing option changes when Impala resource management is enabled.
    • REQUEST_POOL: a new option. (Named YARN_POOL in the Impala 1.2.0 beta; renamed to REQUEST_POOL in Impala 1.3.0.)
    • V_CPU_CORES: a new option.
    • RESERVATION_REQUEST_TIMEOUT: a new option.

    For details of these query options, see impala-shell Query Options for Resource Management.
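
The conditional functions NULLIF(), NULLIFZERO(), and ZEROIFNULL() listed for this release follow simple rules. The following Python sketch models those rules only, using None in place of SQL NULL; it is not Impala code, and the lowercase function names are hypothetical stand-ins for the SQL functions.

```python
def nullif(a, b):
    """NULLIF(a, b): NULL when both arguments are non-NULL and equal, else a."""
    if a is not None and b is not None and a == b:
        return None
    return a


def nullifzero(x):
    """NULLIFZERO(x): NULL when x is zero, otherwise x (NULL stays NULL)."""
    if x is not None and x == 0:
        return None
    return x


def zeroifnull(x):
    """ZEROIFNULL(x): 0 when x is NULL, otherwise x."""
    return 0 if x is None else x


if __name__ == "__main__":
    print(nullif(5, 5), nullif(5, 6))          # None 5
    print(nullifzero(0), nullifzero(7))        # None 7
    print(zeroifnull(None), zeroifnull(3))     # 0 3
```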

New Features in Impala Version 1.2.4

  Note: Impala 1.2.4 works with CDH 4. It is primarily a bug fix release for Impala 1.2.3, plus some performance enhancements for the catalog server to minimize startup and DDL wait times for Impala deployments with large numbers of databases, tables, and partitions.
  • On Impala startup, the metadata loading and synchronization mechanism has been improved and optimized, to give more responsiveness when starting Impala on a system with a large number of databases, tables, or partitions. The initial metadata loading happens in the background, allowing queries to be run before the entire process is finished. When a query refers to a table whose metadata is not yet loaded, the query waits until the metadata for that table is loaded, and the load operation for that table is prioritized to happen first.

  • Formerly, if you created a new table in Hive, you had to issue the INVALIDATE METADATA statement (with no table name) which was an expensive operation that reloaded metadata for all tables. Impala did not recognize the name of the Hive-created table, so you could not do INVALIDATE METADATA new_table to get the metadata for just that one table. Now, when you issue INVALIDATE METADATA table_name, Impala checks to see if that name represents a table created in Hive, and if so recognizes the new table and loads the metadata for it. Additionally, if the new table is in a database that was newly created in Hive, Impala also recognizes the new database.

  • If you issue INVALIDATE METADATA table_name and the table has been dropped through Hive, Impala will recognize that the table no longer exists.

  • New startup options let you control the parallelism of the metadata loading during startup for the catalogd daemon:

    • --load_catalog_in_background makes Impala load and cache metadata using background threads after startup. It is true by default. Previously, a system with a large number of databases, tables, or partitions could be unresponsive or even time out during startup.

    • --num_metadata_loading_threads determines how much parallelism Impala devotes to loading metadata in the background. The default is 16. You might increase this value for systems with huge numbers of databases, tables, or partitions. You might lower this value for busy systems that are CPU-constrained due to jobs from components other than Impala.
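
The background loading behavior described for this release, where a query on a table that is not yet loaded causes that table to be loaded first, can be pictured with a small queue. The following Python sketch is a toy model under stated assumptions, not the catalogd implementation: the class name, the single-threaded loop, and the table names are all hypothetical.

```python
from collections import deque


class BackgroundMetadataLoader:
    """Toy model: tables load in order, but a queried table jumps the queue."""

    def __init__(self, tables):
        self.pending = deque(tables)  # tables whose metadata is not loaded yet
        self.loaded = []              # tables in the order their metadata was loaded

    def load_next(self):
        """Load metadata for the next pending table, if any remain."""
        if self.pending:
            self.loaded.append(self.pending.popleft())

    def query(self, table):
        """A query on an unloaded table moves it to the front and waits for its load."""
        if table in self.pending:
            self.pending.remove(table)
            self.pending.appendleft(table)
            self.load_next()
        return table in self.loaded


if __name__ == "__main__":
    loader = BackgroundMetadataLoader(["t1", "t2", "t3", "t4"])
    loader.load_next()           # background work loads t1
    print(loader.query("t4"))    # True: t4 is loaded ahead of t2 and t3
    print(loader.loaded)         # ['t1', 't4']
```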

New Features in Impala Version 1.2.3

  Note: Impala 1.2.3 works with CDH 4 and with CDH 5 beta 2. The resource management feature requires CDH 5 beta.

Impala 1.2.3 contains exactly the same feature set as Impala 1.2.2. Its only difference is one additional fix for compatibility with Parquet files generated outside of Impala by components such as Hive, Pig, or MapReduce. See Cloudera Impala Known Issues and Workarounds for details of that fix. If you are upgrading from Impala 1.2.1 or earlier, see New Features in Impala Version 1.2.2 for the latest added features.

New Features in Impala Version 1.2.2

  Note: Impala 1.2.2 works with CDH 4. Its feature set is a superset of features in the Impala 1.2.0 beta, with the exception of resource management, which relies on CDH 5.

Impala 1.2.2 includes new features for performance, security, and flexibility. The major enhancements over 1.2.1 are performance related, primarily for join queries.

New user-visible features include:

  • Join order optimizations. This highly valuable feature automatically distributes and parallelizes the work for a join query to minimize disk I/O and network traffic. The automatic optimization reduces the need to use query hints or to rewrite join queries with the tables in a specific order based on size or cardinality. The new COMPUTE STATS statement gathers statistical information about each table that is crucial for enabling the join optimizations. See Performance Considerations for Join Queries for details.

  • COMPUTE STATS statement to collect both table statistics and column statistics with a single statement. Intended to be more comprehensive, efficient, and reliable than the corresponding Hive ANALYZE TABLE statement, which collects statistics in multiple phases through MapReduce jobs. These statistics are important for query planning for join queries, queries on partitioned tables, and other types of data-intensive operations. For optimal planning of join queries, you need to collect statistics for each table involved in the join. See COMPUTE STATS Statement for details.

  • Reordering of tables in a join query can be overridden by the STRAIGHT_JOIN operator, allowing you to fine-tune the planning of the join query if necessary, by using the original technique of ordering the joined tables in descending order of size. See Overriding Join Reordering with STRAIGHT_JOIN for details.

  • The CROSS JOIN clause in the SELECT statement allows Cartesian products in queries, that is, joins without an equality comparison between columns in both tables. Because such queries must be carefully checked to avoid accidental overconsumption of memory, you must use the CROSS JOIN operator to explicitly select this kind of join. See Cross Joins and Cartesian Products with the CROSS JOIN Operator for examples.

  • The ALTER TABLE statement has new clauses that let you fine-tune table statistics. You can use this technique as a less-expensive way to update specific statistics, in case the statistics become stale, or to experiment with the effects of different data distributions on query planning.

  • LDAP username/password authentication in JDBC/ODBC. See Enabling LDAP Authentication for Impala for details.

  • GROUP_CONCAT() aggregate function to concatenate column values across all rows of a result set.

  • The INSERT statement now accepts hints, [SHUFFLE] and [NOSHUFFLE], to influence the way work is redistributed during INSERT...SELECT operations. The hints are primarily useful for inserting into partitioned Parquet tables, where using the [SHUFFLE] hint can avoid problems due to memory consumption and simultaneous open files in HDFS, by collecting all the new data for each partition on a specific node.

  • Several built-in functions and operators are now overloaded for more numeric data types, to reduce the requirement to use CAST() for type coercion in INSERT statements. For example, the expression 2+2 in an INSERT statement formerly produced a BIGINT result, requiring a CAST() to be stored in an INT column. Now, addition, subtraction, and multiplication only produce a result that is one step "bigger" than their arguments, and numeric and conditional functions can return SMALLINT, FLOAT, and other smaller types rather than always BIGINT or DOUBLE.

  • New fnv_hash() built-in function for constructing hashed values. See Mathematical Functions for details.

  • The clause STORED AS PARQUET is accepted as an equivalent for STORED AS PARQUETFILE. This more concise form is recommended for new code.
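
The "one step bigger" rule for addition, subtraction, and multiplication in the list above can be sketched as a lookup over the integer types. The following Python sketch is a simplified model, not Impala's type resolution code. It assumes that the wider of the two argument types decides the result, that BIGINT is a ceiling, and that a small literal such as 2 is treated as TINYINT; the function name is hypothetical.

```python
INTEGER_TYPES = ["TINYINT", "SMALLINT", "INT", "BIGINT"]  # smallest to largest


def arithmetic_result_type(left: str, right: str) -> str:
    """Result type of +, -, or * under the 'one step bigger' rule."""
    widest = max(INTEGER_TYPES.index(left), INTEGER_TYPES.index(right))
    # Assumption: BIGINT cannot be promoted any further, so it stays BIGINT.
    return INTEGER_TYPES[min(widest + 1, len(INTEGER_TYPES) - 1)]


if __name__ == "__main__":
    # Under the TINYINT-literal assumption, 2+2 no longer yields BIGINT.
    print(arithmetic_result_type("TINYINT", "TINYINT"))  # SMALLINT
    print(arithmetic_result_type("SMALLINT", "INT"))     # BIGINT
```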

Because Impala 1.2.2 builds on a number of features introduced in 1.2.1, if you are upgrading from an older 1.1.x release straight to 1.2.2, also review New Features in Impala Version 1.2.1 to see features such as the SHOW TABLE STATS and SHOW COLUMN STATS statements, and user-defined functions (UDFs).

New Features in Impala Version 1.2.1

  Note: Impala 1.2.1 works with CDH 4. Its feature set is a superset of features in the Impala 1.2.0 beta, with the exception of resource management, which relies on CDH 5.

Impala 1.2.1 includes new features for security, performance, and flexibility.

New user-visible features include:

  • SHOW TABLE STATS table_name and SHOW COLUMN STATS table_name statements, to verify that statistics are available and to see the values used during query planning.

  • CREATE TABLE AS SELECT syntax, to create a new table and transfer data into it in a single operation.

  • OFFSET clause, for use with the ORDER BY and LIMIT clauses to produce "paged" result sets such as items 1-10, then 11-20, and so on.

  • NULLS FIRST and NULLS LAST clauses to ensure consistent placement of NULL values in ORDER BY queries.

  • New built-in functions: least(), greatest(), initcap().

  • New aggregate function: ndv(), a fast alternative to COUNT(DISTINCT col) returning an approximate result.

  • The LIMIT clause can now accept a numeric expression as an argument, rather than only a literal constant.

  • The SHOW CREATE TABLE statement displays the end result of all the CREATE TABLE and ALTER TABLE statements for a particular table. You can use the output to produce a simplified setup script for a schema.

  • The --idle_query_timeout and --idle_session_timeout options for impalad control the time intervals after which idle queries are cancelled, and idle sessions expire. See Setting Timeout Periods for Daemons, Queries, and Sessions for details.

  • User-defined functions (UDFs). This feature lets you transform data in very flexible ways, which is important when using Impala as part of an ETL or ELT pipeline. Prior to Impala 1.2, using UDFs required switching into Hive. Impala 1.2 can run scalar UDFs and user-defined aggregate functions (UDAs). Impala can run high-performance functions written in C++, or you can reuse existing Hive functions written in Java.

    You create UDFs through the CREATE FUNCTION statement and drop them through the DROP FUNCTION statement. See User-Defined Functions (UDFs) for instructions about coding, building, and deploying UDFs, and CREATE FUNCTION Statement and DROP FUNCTION Statement for related SQL syntax.

  • A new service automatically propagates changes to table data and metadata made by one Impala node, sending the new or updated metadata to all the other Impala nodes. The automatic synchronization mechanism eliminates the need to use the INVALIDATE METADATA and REFRESH statements after issuing Impala statements such as CREATE TABLE, ALTER TABLE, DROP TABLE, INSERT, and LOAD DATA.

    For even more precise synchronization, you can enable the SYNC_DDL query option before issuing a DDL, INSERT, or LOAD DATA statement. This option causes the statement to wait, returning only after the catalog service has broadcast the applicable changes to all Impala nodes in the cluster.

    Note: Because the catalog service only monitors operations performed through Impala, INVALIDATE METADATA and REFRESH are still needed on the Impala side after creating new tables or loading data through the Hive shell or by manipulating data files directly in HDFS. Because the catalog service broadcasts the result of the REFRESH and INVALIDATE METADATA statements to all Impala nodes, when you do need to use those statements, you can do so a single time rather than on every Impala node.

    This service is implemented by the catalogd daemon. See The Impala Catalog Service for details.

  • CREATE TABLE ... AS SELECT syntax, to create a table and copy data into it in a single operation. See CREATE TABLE Statement for details.

  • The CREATE TABLE and ALTER TABLE statements have new clauses TBLPROPERTIES and WITH SERDEPROPERTIES. The TBLPROPERTIES clause lets you associate arbitrary items of metadata with a particular table as key-value pairs. The WITH SERDEPROPERTIES clause lets you specify the serializer/deserializer (SerDes) classes that read and write data for a table; although Impala does not make use of these properties, sometimes particular values are needed for Hive compatibility. See CREATE TABLE Statement and ALTER TABLE Statement for details.

  • Impersonation support lets you authorize certain OS users associated with applications (for example, hue) to submit requests using the credentials of other users. Only available in combination with CDH 5. See Configuring Per-User Access for Hue for details.

  • Enhancements to EXPLAIN output. In particular, when you enable the new EXPLAIN_LEVEL query option, the EXPLAIN and PROFILE statements produce more verbose output showing estimated resource requirements and whether table and column statistics are available for the applicable tables and columns. See EXPLAIN Statement for details.

  • SHOW CREATE TABLE summarizes the effects of the original CREATE TABLE statement and any subsequent ALTER TABLE statements, giving you a CREATE TABLE statement that will re-create the current structure and layout for a table.

  • The LIMIT clause for queries now accepts an arithmetic expression, in addition to numeric literals.
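
The paging pattern enabled by the OFFSET clause, and the use of an arithmetic expression with LIMIT, can be tried with any SQL engine that supports the same clauses. The following Python sketch uses the standard library sqlite3 module as a stand-in, so it shows the general query shape rather than Impala-specific behavior. The items table, its contents, and the helper names are hypothetical.

```python
import sqlite3


def demo_connection():
    """Create an in-memory table holding the ids 1 through 25."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER)")
    conn.executemany("INSERT INTO items VALUES (?)", [(i,) for i in range(1, 26)])
    return conn


def fetch_page(conn, page, page_size=10):
    """Return one page of ids: ORDER BY fixes the order, OFFSET skips earlier pages."""
    offset = (page - 1) * page_size
    rows = conn.execute(
        "SELECT id FROM items ORDER BY id LIMIT ? OFFSET ?",
        (page_size, offset),
    ).fetchall()
    return [row[0] for row in rows]


def first_six(conn):
    """LIMIT with an arithmetic expression instead of a literal constant."""
    rows = conn.execute("SELECT id FROM items ORDER BY id LIMIT 2 * 3").fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = demo_connection()
    print(fetch_page(conn, 1))  # ids 1-10
    print(fetch_page(conn, 2))  # ids 11-20
    print(first_six(conn))      # ids 1-6
```

The ORDER BY clause matters here: without a fixed ordering, consecutive pages are not guaranteed to be disjoint.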

New Features in Impala Version 1.2.0 (Beta)

  Note: The Impala 1.2.0 beta release only works in combination with the beta version of CDH 5. The Impala 1.2.0 software is bundled together with the CDH 5 beta 1 download.

The Impala 1.2.0 beta includes new features for security, performance, and flexibility.

New user-visible features include:

  • User-defined functions (UDFs). This feature lets you transform data in very flexible ways, which is important when using Impala as part of an ETL or ELT pipeline. Prior to Impala 1.2, using UDFs required switching into Hive. Impala 1.2 can run scalar UDFs and user-defined aggregate functions (UDAs). Impala can run high-performance functions written in C++, or you can reuse existing Hive functions written in Java.

    You create UDFs through the CREATE FUNCTION statement and drop them through the DROP FUNCTION statement. See User-Defined Functions (UDFs) for instructions about coding, building, and deploying UDFs, and CREATE FUNCTION Statement and DROP FUNCTION Statement for related SQL syntax.

  • A new service automatically propagates changes to table data and metadata made by one Impala node, sending the new or updated metadata to all the other Impala nodes. The automatic synchronization mechanism eliminates the need to use the INVALIDATE METADATA and REFRESH statements after issuing Impala statements such as CREATE TABLE, ALTER TABLE, DROP TABLE, INSERT, and LOAD DATA.

    Note: Because this service only monitors operations performed through Impala, INVALIDATE METADATA and REFRESH are still needed on the Impala side after creating new tables or loading data through the Hive shell or by manipulating data files directly in HDFS. Because the catalog service broadcasts the result of the REFRESH and INVALIDATE METADATA statements to all Impala nodes, when you do need to use those statements, you can do so a single time rather than on every Impala node.

    This service is implemented by the catalogd daemon. See The Impala Catalog Service for details.

  • Integration with the YARN resource management framework. Only available in combination with CDH 5. This feature makes use of the underlying YARN service, plus an additional service (Llama) that coordinates requests to YARN for Impala resources, so that the Impala query only proceeds when all requested resources are available. See Using YARN Resource Management with Impala (CDH 5 Only) for full details.

    On the Impala side, this feature involves some new startup options for the impalad daemon:

    • -enable_rm
    • -llama_host
    • -llama_port
    • -llama_callback_port
    • -cgroup_hierarchy_path

    For details of these startup options, see Modifying Impala Startup Options.

    This feature also involves several new or changed query options that you can set through the impala-shell interpreter and apply within a specific session:

    • MEM_LIMIT: the function of this existing option changes when Impala resource management is enabled.
    • YARN_POOL: a new option. (Renamed to REQUEST_POOL in Impala 1.3.0.)
    • V_CPU_CORES: a new option.
    • RESERVATION_REQUEST_TIMEOUT: a new option.

    For details of these query options, see impala-shell Query Options for Resource Management.

  • CREATE TABLE ... AS SELECT syntax, to create a table and copy data into it in a single operation. See CREATE TABLE Statement for details.

  • The CREATE TABLE and ALTER TABLE statements have a new TBLPROPERTIES clause that lets you associate arbitrary items of metadata with a particular table as key-value pairs. See CREATE TABLE Statement and ALTER TABLE Statement for details.

  • Impersonation support lets you authorize certain OS users associated with applications (for example, hue) to submit requests using the credentials of other users. Only available in combination with CDH 5. See Configuring Per-User Access for Hue for details.

  • Enhancements to EXPLAIN output. In particular, when you enable the new EXPLAIN_LEVEL query option, the EXPLAIN and PROFILE statements produce more verbose output showing estimated resource requirements and whether table and column statistics are available for the applicable tables and columns. See EXPLAIN Statement for details.

New Features in Impala Version 1.1.1

Impala 1.1.1 includes new features for security and stability.

New user-visible features include:

  • Additional security feature: auditing. New startup options for impalad let you capture information about Impala queries that succeed or are blocked due to insufficient privileges. To take full advantage of this feature with Cloudera Manager, upgrade to Cloudera Manager 4.7 or higher. For details, see Impala Security Configuration.
  • Parquet data files generated by Impala 1.1.1 are now compatible with the Parquet support in Hive. See Cloudera Impala Incompatible Changes for the procedure to update older Impala-created Parquet files to be compatible with the Hive Parquet support.
  • Additional improvements to stability and resource utilization for Impala queries.
  • Additional enhancements for compatibility with existing file formats.

New Features in Impala Version 1.1

Impala 1.1 includes new features for security, performance, and usability.

New user-visible features include:

  • Extensive new security features, built on top of the Sentry open source project. Impala now supports fine-grained authorization based on roles. A policy file determines which privileges on which schema objects (servers, databases, tables, and HDFS paths) are available to users based on their membership in groups. By assigning privileges for views, you can control access to table data at the column level. For details, see Impala Security Configuration.
  • Impala 1.1 works with Cloudera Manager 4.6 or higher. To use Cloudera Manager to manage authorization for the Impala web UI (the web pages served from port 25000 by default), use Cloudera Manager 4.6.2 or higher.
  • Impala can now create, alter, drop, and query views. Views provide a flexible way to set up simple aliases for complex queries; hide query details from applications and users; and simplify maintenance as you rename or reorganize databases, tables, and columns. See the overview section Views and the statements CREATE VIEW Statement, ALTER VIEW Statement, and DROP VIEW Statement.
  • Performance is improved through a number of automatic optimizations. Resource consumption is also reduced for Impala queries. These improvements apply broadly across all kinds of workloads and file formats. The major areas of performance enhancement include:
    • Improved disk and thread scheduling, which applies to all queries.
    • Improved hash join and aggregation performance, which applies to queries with large build tables or a large number of groups.
    • Dictionary encoding with Parquet, which applies to Parquet tables with short string columns.
    • Improved performance on systems with SSDs, which applies to all queries and file formats.
  • Some new built-in functions are implemented: translate() to substitute characters within strings, user() to check the login ID of the connected user.
  • The new WITH clause for SELECT statements lets you simplify complicated queries in a way similar to creating a view. The effects of the WITH clause only last for the duration of one query, unlike views, which are persistent schema objects that can be used by multiple sessions or applications. See WITH Clause.
  • An enhancement to the DESCRIBE statement, DESCRIBE FORMATTED table_name, displays more detailed information about the table. This information includes the file format, location, delimiter, ownership, whether the table is external or internal, creation and access times, and partitions. The information is returned as a result set that can be interpreted and used by a management or monitoring application. See DESCRIBE Statement.
  • You can now insert a subset of columns for a table, with the other columns left as all NULL values. You can also specify the columns in any order in the destination table, rather than having to match the order of the corresponding columns in the source query or VALUES clause. This feature is known as "column permutation". See INSERT Statement.
  • The new LOAD DATA statement lets you load data into a table directly from an HDFS data file. This technique lets you minimize the number of steps in your ETL process, and provides more flexibility. For example, you can bring data into an Impala table in one step. Formerly, you might have created an external table where the data files are not entirely under your control, or copied the data files to Impala data directories manually, or loaded the original data into one table and then used the INSERT statement to copy it to a new table with a different file format, partitioning scheme, and so on. See LOAD DATA Statement.
  • Improvements to Impala-HBase integration:
  • You can issue REFRESH as a SQL statement through any of the programming interfaces that Impala supports. REFRESH formerly had to be issued as a command through the impala-shell interpreter, and was not available through a JDBC or ODBC API call. As part of this change, the functionality of the REFRESH statement is divided between two statements. In Impala 1.1, REFRESH requires a table name argument and immediately reloads the metadata; the new INVALIDATE METADATA statement works the same as the Impala 1.0 REFRESH did: the table name argument is optional, and the metadata for one or all tables is marked as stale, but not actually reloaded until the table is queried. When you create a new table in the Hive shell or through a different Impala node, you must enter INVALIDATE METADATA with no table parameter before you can see the new table in impala-shell. See REFRESH Statement and INVALIDATE METADATA Statement.
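
The division of work between REFRESH and INVALIDATE METADATA in Impala 1.1, as described in the last item above, comes down to when the reload happens. The following Python sketch is a toy model of that timing on a single node, not Impala's implementation. The class, its attributes, and the table names are hypothetical.

```python
class MetadataCache:
    """Toy model of cached table metadata on one node."""

    def __init__(self, tables):
        self.stale = {table: False for table in tables}
        self.reloads = []  # log of tables whose metadata was actually reloaded

    def _reload(self, table):
        self.reloads.append(table)
        self.stale[table] = False

    def refresh(self, table):
        """REFRESH table_name: the table name is required and the reload is immediate."""
        self._reload(table)

    def invalidate_metadata(self, table=None):
        """INVALIDATE METADATA [table_name]: mark one or all tables stale, reload nothing."""
        targets = [table] if table is not None else list(self.stale)
        for name in targets:
            self.stale[name] = True

    def query(self, table):
        """Querying a stale table triggers the deferred reload."""
        if self.stale[table]:
            self._reload(table)


if __name__ == "__main__":
    cache = MetadataCache(["a", "b"])
    cache.refresh("a")
    cache.invalidate_metadata()   # no table name: every table is marked stale
    cache.query("b")              # only now is the metadata for b reloaded
    print(cache.reloads)          # ['a', 'b']
```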

New Features in Impala Version 1.0.1

The primary enhancements in Impala 1.0.1 are internal, for compatibility with the new Cloudera Manager 4.6 release. Try out the new Impala Query Monitoring feature in Cloudera Manager 4.6, which requires Impala 1.0.1.

New user-visible features include:

  • The VALUES clause lets you INSERT one or more rows using literals, function return values, or other expressions. For performance and scalability, you should still use INSERT ... SELECT for bringing large quantities of data into an Impala table. The VALUES clause is a convenient way to set up small tables, particularly for initial testing of SQL features that do not require large amounts of data. See VALUES Clause for details.
  • The -B and -o options of the impala-shell command can turn query results into delimited text files and store them in an output file. The plain text results are useful with other Hadoop components or Unix tools. In benchmark tests, it is also faster to produce plain rather than pretty-printed results, and to write to a file rather than to the screen, giving a more accurate picture of the actual query time.
  • Several bug fixes. See Issues Fixed in the 1.0.1 Release for details.

New Features in Impala Version 1.0

This version has multiple performance improvements and adds the following functionality:

New Features in Version 0.7 of the Cloudera Impala Beta Release

This version has multiple performance improvements and adds the following functionality:

  • Several bug fixes. See Known Issues Fixed in Version 0.7 of the Beta Release.
  • Support for the Parquet file format. For more information on file formats, see Understanding File Formats.
  • Added support for Avro.
  • Support for memory limits. For more information, see the example on modifying memory limits in Modifying Impala Startup Options.
  • Bigger and faster joins through the addition of partitioned joins to the already supported broadcast joins.
  • Fully distributed aggregations.
  • Fully distributed top-n computation.
  • Support for creating and altering tables.
  • Support for GROUP BY with floats and doubles.

In this version, both CDH 4.1 and 4.2 are supported, but due to performance improvements added, we highly recommend you use CDH 4.2 or higher to see the full benefit. If you are using Cloudera Manager, version 4.5 is required.

New Features in Version 0.6 of the Cloudera Impala Beta Release

  • Several bug fixes. See Known Issues Fixed in Version 0.6 of the Beta Release.
  • Added support for Impala on SUSE and Debian/Ubuntu. Impala is now supported on:
    • RHEL 5.7/6.2 and CentOS 5.7/6.2
    • SUSE 11 with Service Pack 1 or later
    • Ubuntu 10.04/12.04 and Debian 6.03
  • Cloudera Manager 4.5 and CDH 4.2 support Impala 0.6.
  • Support for the RCFile file format. For more information on file formats, see Understanding File Formats.

New Features in Version 0.5 of the Cloudera Impala Beta Release

New Features in Version 0.4 of the Cloudera Impala Beta Release

  • Several bug fixes. See Known Issues Fixed in Version 0.4 of the Beta Release.
  • Added support for Impala on RHEL 5.7/CentOS 5.7. Impala is now supported on RHEL 5.7/6.2 and CentOS 5.7/6.2.
  • Cloudera Manager 4.1.3 supports Impala 0.4.
  • The Impala debug webserver now has the ability to serve static files from ${IMPALA_HOME}/www. This can be disabled by setting --enable_webserver_doc_root=false on the command line. As a result, Impala now uses the Twitter Bootstrap library to style its debug webpages, and the /queries page now tracks the last 25 queries run by each Impala daemon.
  • Additional metrics available on the Impala Debug Webpage.

New Features in Version 0.3 of the Cloudera Impala Beta Release

  • Several bug fixes. See Known Issues Fixed in Version 0.3 of the Beta Release.
  • The state-store-service binary has been renamed statestored.
  • The location of the Impala configuration files has changed from the /usr/lib/impala/conf directory to the /etc/impala/conf directory.

New Features in Version 0.2 of the Cloudera Impala Beta Release

Page generated September 3, 2015.