Support Questions

Find answers, ask questions, and share your expertise
Announcements
Celebrating as our community reaches 100,000 members! Thank you!

Year 2038 problem in Impala

avatar
Explorer

Is this a known issue with Impala?

 

unix_timestamp is apparently not working properly with dates beyond the year 2038.  In contrast, Hive appears to not have this problem.

 

[cloudera@quickstart ~]$ impala-shell
Starting Impala Shell without Kerberos authentication
Connected to quickstart.cloudera:21000
Server version: impalad version 2.1.0-cdh5 RELEASE (build e48c2b48c53ea9601b8f47a39373aa83ff7ca6e2)
Welcome to the Impala shell. Press TAB twice to see a list of available commands.

Copyright (c) 2012 Cloudera, Inc. All rights reserved.

(Shell build version: Impala Shell v2.1.0-cdh5 (e48c2b4) built on Tue Dec 16 19:00:35 PST 2014)
[quickstart.cloudera:21000] > select  to_date( from_unixtime( unix_timestamp( '2036-06-14', 'yyyy-MM-dd' ))) as a,
                            >         to_date( from_unixtime( unix_timestamp( '2037-06-14', 'yyyy-MM-dd' ))) as b,
                            >         to_date( from_unixtime( unix_timestamp( '2038-06-14', 'yyyy-MM-dd' ))) as c;
Query: select to_date( from_unixtime( unix_timestamp( '2036-06-14', 'yyyy-MM-dd' ))) as a,
to_date( from_unixtime( unix_timestamp( '2037-06-14', 'yyyy-MM-dd' ))) as b,
to_date( from_unixtime( unix_timestamp( '2038-06-14', 'yyyy-MM-dd' ))) as c
+------------+------------+------------+
| a          | b          | c          |
+------------+------------+------------+
| 2036-06-14 | 2037-06-14 | 1902-05-08 |
+------------+------------+------------+
Fetched 1 row(s) in 0.41s
[quickstart.cloudera:21000] >

 

Here is the same query in Hive:

 

[cloudera@quickstart ~]$ hive

Logging initialized using configuration in file:/etc/hive/conf.dist/hive-log4j.properties
hive> select  to_date( from_unixtime( unix_timestamp( '2036-06-14', 'yyyy-MM-dd' ))) as a,
    >         to_date( from_unixtime( unix_timestamp( '2037-06-14', 'yyyy-MM-dd' ))) as b,
    >         to_date( from_unixtime( unix_timestamp( '2038-06-14', 'yyyy-MM-dd' ))) as c;
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1425573450783_0058, Tracking URL = http://quickstart.cloudera:8088/proxy/application_1425573450783_0058/
Kill Command = /usr/lib/hadoop/bin/hadoop job  -kill job_1425573450783_0058
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2015-03-13 11:03:43,809 Stage-1 map = 0%,  reduce = 0%
2015-03-13 11:03:55,265 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 1.71 sec
MapReduce Total cumulative CPU time: 1 seconds 710 msec
Ended Job = job_1425573450783_0058
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1   Cumulative CPU: 1.71 sec   HDFS Read: 284 HDFS Write: 33 SUCCESS
Total MapReduce CPU Time Spent: 1 seconds 710 msec
OK
2036-06-14 2037-06-14 2038-06-14
Time taken: 29.0 seconds, Fetched: 1 row(s)

 

Thanks

 

Morgan

 

 

1 ACCEPTED SOLUTION

avatar
Expert Contributor

Yes, it is a known issue. https://issues.cloudera.org/browse/IMPALA-1579 tracks it, and a fix will be in the next Impala release in CDH5.4.

 

Best,

Henry

View solution in original post

4 REPLIES 4

avatar
Contributor

Isn't that more of a general Unix time problem?

avatar
Explorer

Perhaps it is, but as I showed in the post above, Hive works correctly, but Impala fails.

 

Thanks

 

Morgan

avatar
Expert Contributor

Yes, it is a known issue. https://issues.cloudera.org/browse/IMPALA-1579 tracks it, and a fix will be in the next Impala release in CDH5.4.

 

Best,

Henry

avatar
Explorer

Henry,

 

Thanks for the info.

 

Morgan