This is the documentation for Cloudera Impala 1.4.x.
Documentation for other versions is available at Cloudera.com.

Date and Time Functions

The underlying Impala data type for date and time data is TIMESTAMP, which has both a date and a time portion. Functions that extract a single field, such as hour() or minute(), typically return an integer value. Functions that format the date portion, such as date_add() or to_date(), typically return a string value.

You can also adjust a TIMESTAMP value by adding or subtracting an INTERVAL expression. See TIMESTAMP Data Type for details. INTERVAL expressions are also allowed as the second argument for the date_add() and date_sub() functions, rather than integers.

Impala supports the following data and time functions:

add_months(timestamp date, int months), add_months(timestamp date, bigint months)
Purpose: Returns the specified date and time plus some number of months.

Return type: timestamp

Usage notes: Same as months_add(). Available in Impala 1.4 and higher. For compatibility when porting code with vendor extensions.

adddate(timestamp startdate, int days), adddate(timestamp startdate, bigint days),
Purpose: Adds a specified number of days to a TIMESTAMP value. Similar to date_add(), but starts with an actual TIMESTAMP value instead of a string that is converted to a TIMESTAMP.

Return type: timestamp

current_timestamp()
Purpose: Alias for the now() function.

Return type: timestamp

date_add(timestamp startdate, int days), date_add(timestamp startdate, interval_expression)
Purpose: Adds a specified number of days to a TIMESTAMP value. The first argument can be a string, which is automatically cast to TIMESTAMP if it uses the recognized format, as described in TIMESTAMP Data Type. With an INTERVAL expression as the second argument, you can calculate a delta value using other units such as weeks, years, hours, seconds, and so on; see TIMESTAMP Data Type for details.

Return type: timestamp

date_sub(timestamp startdate, int days), date_sub(timestamp startdate, interval_expression)
Purpose: Subtracts a specified number of days from a TIMESTAMP value. The first argument can be a string, which is automatically cast to TIMESTAMP if it uses the recognized format, as described in TIMESTAMP Data Type. With an INTERVAL expression as the second argument, you can calculate a delta value using other units such as weeks, years, hours, seconds, and so on; see TIMESTAMP Data Type for details.

Return type: timestamp

datediff(string enddate, string startdate)
Purpose: Returns the number of days between two dates represented as strings.

Return type: int

day(string date), dayofmonth(string date)
Purpose: Returns the day field from a date represented as a string.

Return type: int

dayname(string date)
Purpose: Returns the day field from a date represented as a string, converted to the string corresponding to that day name. The range of return values is 'Sunday' to 'Saturday'. Used in report-generating queries, as an alternative to calling dayofweek() and turning that numeric return value into a string using a CASE expression.

Return type: string

dayofweek(string date)
Purpose: Returns the day field from a date represented as a string, corresponding to the day of the week. The range of return values is 1 (Sunday) to 7 (Saturday).

Return type: int

dayofyear(timestamp date)
Purpose: Returns the day field from a TIMESTAMP value, corresponding to the day of the year. The range of return values is 1 (January 1) to 366 (December 31 of a leap year).

Return type: int

days_add(timestamp startdate, int days), days_add(timestamp startdate, bigint days)
Purpose: Adds a specified number of days to a TIMESTAMP value. Similar to date_add(), but starts with an actual TIMESTAMP value instead of a string that is converted to a TIMESTAMP.

Return type: timestamp

days_sub(timestamp startdate, int days), days_sub(timestamp startdate, bigint days)
Purpose: Subtracts a specified number of days from a TIMESTAMP value. Similar to date_sub(), but starts with an actual TIMESTAMP value instead of a string that is converted to a TIMESTAMP.

Return type: timestamp

extract(timestamp, string unit)
Purpose: Returns one of the numeric date or time fields from a TIMESTAMP value.

Unit argument: The unit string can be one of year, month, day, hour, minute, second, or millisecond. This argument value is case-insensitive.

Usage notes: Typically used in GROUP BY queries to arrange results by hour, day, month, and so on. You can also use this function in an INSERT ... SELECT into a partitioned table to split up TIMESTAMP values into individual parts, if the partitioned table has separate partition key columns representing year, month, day, and so on. If you need to divide by more complex units of time, such as by week or by quarter, use the TRUNC() function instead.

Return type: int

from_unixtime(bigint unixtime[, string format])
Purpose: Converts the number of seconds from the Unix epoch to the specified time into a string.

Return type: string

Usage notes: The format string accepts the variations allowed for the TIMESTAMP data type: date plus time, date by itself, time by itself, and optional fractional seconds for the time. See TIMESTAMP Data Type for details.

Currently, the format string is case-sensitive, especially to distinguish m for minutes and M for months. In Impala 1.3 and higher, you can switch the order of elements, use alternative separator characters, and use a different number of placeholders for each unit. Adding more instances of y, d, H, and so on produces output strings zero-padded to the requested number of characters. The exception is M for months, where M produces a non-padded value such as 3, MM produces a zero-padded value such as 03, MMM produces an abbreviated month name such as Mar, and sequences of 4 or more M are not allowed. A date string including all fields could be "yyyy-MM-dd HH:mm:ss.SSSSSS", "dd/MM/yyyy HH:mm:ss.SSSSSS", "MMM dd, yyyy HH.mm.ss (SSSSSS)" or other combinations of placeholders and separator characters.

  Note: The more flexible format strings allowed with the built-in functions do not change the rules about using CAST() to convert from a string to a TIMESTAMP value. Strings being casted must still have the elements in the specified order and use the specified delimiter characters, as described in TIMESTAMP Data Type.

Examples:

[localhost:21000] > select from_unixtime(1392394861,"yyyy-MM-dd HH:mm:ss.SSSS");
+-------------------------------------------------------+
| from_unixtime(1392394861, 'yyyy-mm-dd hh:mm:ss.ssss') |
+-------------------------------------------------------+
| 2014-02-14 16:21:01.0000                              |
+-------------------------------------------------------+
[localhost:21000] > select from_unixtime(1392394861,"yyyy-MM-dd");
+-----------------------------------------+
| from_unixtime(1392394861, 'yyyy-mm-dd') |
+-----------------------------------------+
| 2014-02-14                              |
+-----------------------------------------+
[localhost:21000] > select from_unixtime(1392394861,"HH:mm:ss.SSSS");
+--------------------------------------------+
| from_unixtime(1392394861, 'hh:mm:ss.ssss') |
+--------------------------------------------+
| 16:21:01.0000                              |
+--------------------------------------------+
[localhost:21000] > select from_unixtime(1392394861,"HH:mm:ss");
+---------------------------------------+
| from_unixtime(1392394861, 'hh:mm:ss') |
+---------------------------------------+
| 16:21:01                              |
+---------------------------------------+
unix_timestamp() and from_unixtime() are often used in combination to convert a TIMESTAMP value into a particular string format. For example:
select from_unixtime(unix_timestamp(now() + interval 3 days), 'yyyy/MM/dd HH:mm');
from_utc_timestamp(timestamp, string timezone)
Purpose: Converts a specified UTC timestamp value into the appropriate value for a specified time zone.

Return type: timestamp

hour(string date)
Purpose: Returns the hour field from a date represented as a string.

Return type: int

hours_add(timestamp date, int hours), hours_add(timestamp date, bigint hours)
Purpose: Returns the specified date and time plus some number of hours.

Return type: timestamp

hours_sub(timestamp date, int hours), hours_sub(timestamp date, bigint hours)
Purpose: Returns the specified date and time minus some number of hours.

Return type: timestamp

microseconds_add(timestamp date, int microseconds), microseconds_add(timestamp date, bigint microseconds)
Purpose: Returns the specified date and time plus some number of microseconds.

Return type: timestamp

microseconds_sub(timestamp date, int microseconds), microseconds_sub(timestamp date, bigint microseconds)
Purpose: Returns the specified date and time minus some number of microseconds.

Return type: timestamp

milliseconds_add(timestamp date, int milliseconds), milliseconds_add(timestamp date, bigint milliseconds)
Purpose: Returns the specified date and time plus some number of milliseconds.

Return type: timestamp

milliseconds_sub(timestamp date, int milliseconds), milliseconds_sub(timestamp date, bigint milliseconds)
Purpose: Returns the specified date and time minus some number of milliseconds.

Return type: timestamp

minute(string date)
Purpose: Returns the minute field from a date represented as a string.

Return type: int

minutes_add(timestamp date, int minutes), minutes_add(timestamp date, bigint minutes)
Purpose: Returns the specified date and time plus some number of minutes.

Return type: timestamp

minutes_sub(timestamp date, int minutes), minutes_sub(timestamp date, bigint minutes)
Purpose: Returns the specified date and time minus some number of minutes.

Return type: timestamp

month(string date)
Purpose: Returns the month field from a date represented as a string.

Return type: int

months_add(timestamp date, int months), months_add(timestamp date, bigint months)
Purpose: Returns the specified date and time plus some number of months.

Return type: timestamp

months_sub(timestamp date, int months), months_sub(timestamp date, bigint months)
Purpose: Returns the specified date and time minus some number of months.

Return type: timestamp

nanoseconds_add(timestamp date, int nanoseconds), nanoseconds_add(timestamp date, bigint nanoseconds)
Purpose: Returns the specified date and time plus some number of nanoseconds.

Return type: timestamp

nanoseconds_sub(timestamp date, int nanoseconds), nanoseconds_sub(timestamp date, bigint nanoseconds)
Purpose: Returns the specified date and time minus some number of nanoseconds.

Return type: timestamp

now()
Purpose: Returns the current date and time (in the UTC time zone) as a timestamp value.

Return type: timestamp

Usage notes: To find a date/time value in the future or the past relative to the current date and time, add or subtract an INTERVAL expression to the return value of now(). See TIMESTAMP Data Type for examples.

second(string date)
Purpose: Returns the second field from a date represented as a string.

Return type: int

seconds_add(timestamp date, int seconds), seconds_add(timestamp date, bigint seconds)
Purpose: Returns the specified date and time plus some number of seconds.

Return type: timestamp

seconds_sub(timestamp date, int seconds), seconds_sub(timestamp date, bigint seconds)
Purpose: Returns the specified date and time minus some number of seconds.

Return type: timestamp

subdate(timestamp startdate, int days), subdate(timestamp startdate, bigint days),
Purpose: Subtracts a specified number of days from a TIMESTAMP value. Similar to date_sub(), but starts with an actual TIMESTAMP value instead of a string that is converted to a TIMESTAMP.

Return type: timestamp

to_date(timestamp)
Purpose: Returns a string representation of the date field from a timestamp value.

Return type: string

to_utc_timestamp(timestamp, string timezone)
Purpose: Converts a specified timestamp value in a specified time zone into the corresponding value for the UTC time zone.

Return type: timestamp

trunc(timestamp, string unit)
Purpose: Strips off fields and optionally rounds a TIMESTAMP value.
Unit argument: The unit argument value is case-sensitive. This argument string can be one of:
  • SYYYY, YYYY, YEAR, SYEAR, YYY, YY, Y: Year (rounds up on July 1).
  • Q: Quarter (rounds up on the sixteenth day of the second month of the quarter).
  • MONTH, MON, MM, RM: Month (rounds up on the sixteenth day).
  • WW, W: Same day of the week as the first day of the month.
  • DDD, DD, J: Day of the month.
  • DAY, DY, D: Starting day of the week.
  • HH, HH12, HH24: Hour. A TIMESTAMP value rounded or truncated to the hour is always represented in 24-hour notation, even for the HH12 argument string.
  • MI: Minute.

Usage notes: Typically used in GROUP BY queries to aggregate results from the same hour, day, week, month, quarter, and so on. You can also use this function in an INSERT ... SELECT into a partitioned table to divide TIMESTAMP values into the correct partition.

Because the return value is a TIMESTAMP, if you cast the result of TRUNC() to STRING, you will often see zeroed-out portions such as 00:00:00 in the time field. If you only need the individual units such as hour, day, month, or year, use the EXTRACT() function instead. If you need the individual units from a truncated TIMESTAMP value, run the TRUNCATE() function on the original value, then run EXTRACT() on the result.

Return type: timestamp

unix_timestamp(), unix_timestamp(string datetime), unix_timestamp(string datetime, string format), unix_timestamp(timestamp datetime)
Purpose: Returns an integer value representing the current date and time as a delta from the Unix epoch, or converts from a specified date and time value represented as a TIMESTAMP or STRING.

Return type: bigint

Usage notes: See from_unixtime() for details about the patterns you can use in the format string to represent the position of year, month, day, and so on in the date string. In Impala 1.3 and higher, you have more flexibility to switch the positions of elements and use different separator characters.

unix_timestamp() and from_unixtime() are often used in combination to convert a TIMESTAMP value into a particular string format. For example:
select from_unixtime(unix_timestamp(now() + interval 3 days), 'yyyy/MM/dd HH:mm');
weekofyear(string date)
Purpose: Returns the corresponding week (1-53) from a date represented as a string.

Return type: int

weeks_add(timestamp date, int weeks), weeks_add(timestamp date, bigint weeks)
Purpose: Returns the specified date and time plus some number of weeks.

Return type: timestamp

weeks_sub(timestamp date, int weeks), weeks_sub(timestamp date, bigint weeks)
Purpose: Returns the specified date and time minus some number of weeks.

Return type: timestamp

year(string date)
Purpose: Returns the year field from a date represented as a string.

Return type: int

years_add(timestamp date, int years), years_add(timestamp date, bigint years)
Purpose: Returns the specified date and time plus some number of years.

Return type: timestamp

years_sub(timestamp date, int years), years_sub(timestamp date, bigint years)
Purpose: Returns the specified date and time minus some number of years.

Return type: timestamp