This is the documentation for Cloudera Impala 1.4.0.
Documentation for other versions is available at Cloudera.com.

DESCRIBE Statement

The DESCRIBE statement displays metadata about a table, such as the column names and their data types. Its syntax is:

DESCRIBE [FORMATTED] table

You can use the abbreviation DESC for the DESCRIBE statement.

The DESCRIBE FORMATTED variation displays additional information, in a format familiar to users of Apache Hive. The extra information includes low-level details such as whether the table is internal or external, when it was created, the file format, the location of the data in HDFS, whether the object is a table or a view, and (for views) the text of the query from the view definition.

  Note: The Compressed field is not a reliable indicator of whether the table contains compressed data. It typically always shows No, because the compression settings only apply during the session that loads data and are not stored persistently with the table metadata.

Usage notes:

After the impalad daemons are restarted, the first query against a table can take longer than subsequent queries, because the metadata for the table is loaded before the query is processed. This one-time delay for each table can cause misleading results in benchmark tests or cause unnecessary concern. To "warm up" the Impala metadata cache, you can issue a DESCRIBE statement in advance for each table you intend to access later.

When you are dealing with data files stored in HDFS, sometimes it is important to know details such as the path of the data files for an Impala table, and the host name for the namenode. You can get this information from the DESCRIBE FORMATTED output. You specify HDFS URIs or path specifications with statements such as LOAD DATA and the LOCATION clause of CREATE TABLE or ALTER TABLE. You might also use HDFS URIs or paths with Linux commands such as hadoop and hdfs to copy, rename, and so on, data files in HDFS.

If you connect to different Impala nodes within an impala-shell session for load-balancing purposes, you can enable the SYNC_DDL query option to make each DDL statement wait before returning, until the new or changed metadata has been received by all the Impala nodes. See SYNC_DDL for details.

Each table can also have associated table statistics and column statistics. To see these categories of information, use the SHOW TABLE STATS table_name and SHOW COLUMN STATS table_name statements. See SHOW Statement for details.

  Important: After adding or replacing data in a table used in performance-critical queries, issue a COMPUTE STATS statement to make sure all statistics are up-to-date. Consider updating statistics for a table after any INSERT, LOAD DATA, or CREATE TABLE AS SELECT statement in Impala, or after loading data through Hive and doing a REFRESH table_name in Impala. This technique is especially important for tables that are very large, used in join queries, or both.

Examples:

The following example shows the results of both a standard DESCRIBE and DESCRIBE FORMATTED for different kinds of schema objects:

  • DESCRIBE for a table or a view returns the name, type, and comment for each of the columns. For a view, if the column value is computed by an expression, the column name is automatically generated as _c0, _c1, and so on depending on the ordinal number of the column.
  • A table created with no special format or storage clauses is designated as a MANAGED_TABLE (an "internal table" in Impala terminology). Its data files are stored in an HDFS directory under the default Hive data directory. By default, it uses Text data format.
  • A view is designated as VIRTUAL_VIEW in DESCRIBE FORMATTED output. Some of its properties are NULL or blank because they are inherited from the base table. The text of the query that defines the view is part of the DESCRIBE FORMATTED output.
  • A table with additional clauses in the CREATE TABLE statement has differences in DESCRIBE FORMATTED output. The output for T2 includes the EXTERNAL_TABLE keyword because of the CREATE EXTERNAL TABLE syntax, and different InputFormat and OutputFormat fields to reflect the Parquet file format.
[localhost:21000] > create table t1 (x int, y int, s string);
Query: create table t1 (x int, y int, s string)
[localhost:21000] > describe t1;
Query: describe t1
Query finished, fetching results ...
+------+--------+---------+
| name | type   | comment |
+------+--------+---------+
| x    | int    |         |
| y    | int    |         |
| s    | string |         |
+------+--------+---------+
Returned 3 row(s) in 0.13s
[localhost:21000] > describe formatted t1;
Query: describe formatted t1
Query finished, fetching results ...
+------------------------------+--------------------------------------------------------------------+----------------------+
| name                         | type                                                               | comment              |
+------------------------------+--------------------------------------------------------------------+----------------------+
| # col_name                   | data_type                                                          | comment              |
|                              | NULL                                                               | NULL                 |
| x                            | int                                                                | None                 |
| y                            | int                                                                | None                 |
| s                            | string                                                             | None                 |
|                              | NULL                                                               | NULL                 |
| # Detailed Table Information | NULL                                                               | NULL                 |
| Database:                    | describe_formatted                                                 | NULL                 |
| Owner:                       | cloudera                                                           | NULL                 |
| CreateTime:                  | Mon Jul 22 17:03:16 EDT 2013                                       | NULL                 |
| LastAccessTime:              | UNKNOWN                                                            | NULL                 |
| Protect Mode:                | None                                                               | NULL                 |
| Retention:                   | 0                                                                  | NULL                 |
| Location:                    | hdfs://127.0.0.1:8020/user/hive/warehouse/describe_formatted.db/t1 | NULL                 |
| Table Type:                  | MANAGED_TABLE                                                      | NULL                 |
| Table Parameters:            | NULL                                                               | NULL                 |
|                              | transient_lastDdlTime                                              | 1374526996           |
|                              | NULL                                                               | NULL                 |
| # Storage Information        | NULL                                                               | NULL                 |
| SerDe Library:               | org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe                 | NULL                 |
| InputFormat:                 | org.apache.hadoop.mapred.TextInputFormat                           | NULL                 |
| OutputFormat:                | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat         | NULL                 |
| Compressed:                  | No                                                                 | NULL                 |
| Num Buckets:                 | 0                                                                  | NULL                 |
| Bucket Columns:              | []                                                                 | NULL                 |
| Sort Columns:                | []                                                                 | NULL                 |
+------------------------------+--------------------------------------------------------------------+----------------------+
Returned 26 row(s) in 0.03s
[localhost:21000] > create view v1 as select x, upper(s) from t1;
Query: create view v1 as select x, upper(s) from t1
[localhost:21000] > describe v1;
Query: describe v1
Query finished, fetching results ...
+------+--------+---------+
| name | type   | comment |
+------+--------+---------+
| x    | int    |         |
| _c1  | string |         |
+------+--------+---------+
Returned 2 row(s) in 0.10s
[localhost:21000] > describe formatted v1;
Query: describe formatted v1
Query finished, fetching results ...
+------------------------------+------------------------------+----------------------+
| name                         | type                         | comment              |
+------------------------------+------------------------------+----------------------+
| # col_name                   | data_type                    | comment              |
|                              | NULL                         | NULL                 |
| x                            | int                          | None                 |
| _c1                          | string                       | None                 |
|                              | NULL                         | NULL                 |
| # Detailed Table Information | NULL                         | NULL                 |
| Database:                    | describe_formatted           | NULL                 |
| Owner:                       | cloudera                     | NULL                 |
| CreateTime:                  | Mon Jul 22 16:56:38 EDT 2013 | NULL                 |
| LastAccessTime:              | UNKNOWN                      | NULL                 |
| Protect Mode:                | None                         | NULL                 |
| Retention:                   | 0                            | NULL                 |
| Table Type:                  | VIRTUAL_VIEW                 | NULL                 |
| Table Parameters:            | NULL                         | NULL                 |
|                              | transient_lastDdlTime        | 1374526598           |
|                              | NULL                         | NULL                 |
| # Storage Information        | NULL                         | NULL                 |
| SerDe Library:               | null                         | NULL                 |
| InputFormat:                 | null                         | NULL                 |
| OutputFormat:                | null                         | NULL                 |
| Compressed:                  | No                           | NULL                 |
| Num Buckets:                 | 0                            | NULL                 |
| Bucket Columns:              | []                           | NULL                 |
| Sort Columns:                | []                           | NULL                 |
|                              | NULL                         | NULL                 |
| # View Information           | NULL                         | NULL                 |
| View Original Text:          | SELECT x, upper(s) FROM t1   | NULL                 |
| View Expanded Text:          | SELECT x, upper(s) FROM t1   | NULL                 |
+------------------------------+------------------------------+----------------------+
Returned 28 row(s) in 0.03s
[localhost:21000] > create external table t2 (x int, y int, s string) stored as parquet location '/user/cloudera/sample_data';
[localhost:21000] > describe formatted t2;
Query: describe formatted t2
Query finished, fetching results ...
+------------------------------+----------------------------------------------------+----------------------+
| name                         | type                                               | comment              |
+------------------------------+----------------------------------------------------+----------------------+
| # col_name                   | data_type                                          | comment              |
|                              | NULL                                               | NULL                 |
| x                            | int                                                | None                 |
| y                            | int                                                | None                 |
| s                            | string                                             | None                 |
|                              | NULL                                               | NULL                 |
| # Detailed Table Information | NULL                                               | NULL                 |
| Database:                    | describe_formatted                                 | NULL                 |
| Owner:                       | cloudera                                           | NULL                 |
| CreateTime:                  | Mon Jul 22 17:01:47 EDT 2013                       | NULL                 |
| LastAccessTime:              | UNKNOWN                                            | NULL                 |
| Protect Mode:                | None                                               | NULL                 |
| Retention:                   | 0                                                  | NULL                 |
| Location:                    | hdfs://127.0.0.1:8020/user/cloudera/sample_data    | NULL                 |
| Table Type:                  | EXTERNAL_TABLE                                     | NULL                 |
| Table Parameters:            | NULL                                               | NULL                 |
|                              | EXTERNAL                                           | TRUE                 |
|                              | transient_lastDdlTime                              | 1374526907           |
|                              | NULL                                               | NULL                 |
| # Storage Information        | NULL                                               | NULL                 |
| SerDe Library:               | org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe | NULL                 |
| InputFormat:                 | com.cloudera.impala.hive.serde.ParquetInputFormat  | NULL                 |
| OutputFormat:                | com.cloudera.impala.hive.serde.ParquetOutputFormat | NULL                 |
| Compressed:                  | No                                                 | NULL                 |
| Num Buckets:                 | 0                                                  | NULL                 |
| Bucket Columns:              | []                                                 | NULL                 |
| Sort Columns:                | []                                                 | NULL                 |
+------------------------------+----------------------------------------------------+----------------------+
Returned 27 row(s) in 0.17s

Cancellation: Cannot be cancelled.