This is the documentation for CDH 5.1.x. Documentation for other versions is available at Cloudera Documentation.

PARQUET_COMPRESSION_CODEC

When Impala writes Parquet data files using the INSERT statement, the underlying compression is controlled by the PARQUET_COMPRESSION_CODEC query option. The allowed values for this query option are SNAPPY (the default), GZIP, and NONE. The option value is not case-sensitive. See Snappy and GZip Compression for Parquet Data Files for details and examples.
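As a minimal sketch of how the option might be used in an impala-shell session, the statements below set the codec and then write Parquet files with an INSERT statement. The table names raw_data and parquet_gzip are hypothetical placeholders, not names taken from this page. The value is written in lowercase to show that the option is not case-sensitive.

```sql
-- Hypothetical tables: raw_data is assumed to exist already;
-- parquet_gzip is created here with the same column definitions.
CREATE TABLE parquet_gzip LIKE raw_data STORED AS PARQUET;

-- Choose the codec for Parquet files written by later INSERT statements
-- in this session. Allowed values: SNAPPY (the default), GZIP, NONE.
SET PARQUET_COMPRESSION_CODEC=gzip;

-- Data files written by this statement use the codec chosen above.
INSERT INTO parquet_gzip SELECT * FROM raw_data;

-- Switch back to the default before writing other tables.
SET PARQUET_COMPRESSION_CODEC=snappy;
```

Setting the option back to SNAPPY afterward keeps later INSERT statements in the same session from unintentionally inheriting the GZIP setting.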

If the option is set to an unrecognized value, all queries fail because of the invalid option setting, not just queries that involve Parquet tables.

Default: SNAPPY

For information about the Parquet file format, and how compressing the data files affects query performance, see Using the Parquet File Format with Impala Tables.

Page generated September 3, 2015.