Impala SQL Language Reference

Impala uses SQL as its query language. To protect user investment in skills development and query design, Impala provides a high degree of compatibility with the Hive Query Language (HiveQL):

  • Because Impala uses the same metadata store as Hive to record information about table structure and properties, Impala can access tables defined through the native Impala CREATE TABLE command, or tables created using the Hive data definition language (DDL).
  • Impala supports data manipulation (DML) statements similar to the DML component of HiveQL.
  • Impala provides many built-in functions with the same names and parameter types as their HiveQL equivalents.

Impala supports most of the same statements and clauses as HiveQL, including, but not limited to JOIN, AGGREGATE, DISTINCT, UNION ALL, ORDER BY, LIMIT and (uncorrelated) subquery in the FROM clause. Impala also supports INSERT INTO and INSERT OVERWRITE.

Impala supports data types with the same names and semantics as the equivalent Hive data types: STRING, TINYINT, SMALLINT, INT, BIGINT, FLOAT, DOUBLE, BOOLEAN, STRING, TIMESTAMP.

For full details about Impala SQL syntax and semantics, see Impala SQL Statements.

Most HiveQL SELECT and INSERT statements run unmodified with Impala. For information about Hive syntax not available in Impala, see SQL Differences Between Impala and Hive.

For a list of the built-in functions available in Impala queries, see Impala Built-In Functions.