Support Questions

Find answers, ask questions, and share your expertise
Announcements
Check out our newest addition to the community, the Cloudera Data Analytics (CDA) group hub.

Year 2038 problem in Impala

Explorer

Is this a known issue with Impala?

 

unix_timestamp is apparently not working properly with dates beyond the year 2038.  In contrast, Hive appears to not have this problem.

 

[cloudera@quickstart ~]$ impala-shell
Starting Impala Shell without Kerberos authentication
Connected to quickstart.cloudera:21000
Server version: impalad version 2.1.0-cdh5 RELEASE (build e48c2b48c53ea9601b8f47a39373aa83ff7ca6e2)
Welcome to the Impala shell. Press TAB twice to see a list of available commands.

Copyright (c) 2012 Cloudera, Inc. All rights reserved.

(Shell build version: Impala Shell v2.1.0-cdh5 (e48c2b4) built on Tue Dec 16 19:00:35 PST 2014)
[quickstart.cloudera:21000] > select  to_date( from_unixtime( unix_timestamp( '2036-06-14', 'yyyy-MM-dd' ))) as a,
                            >         to_date( from_unixtime( unix_timestamp( '2037-06-14', 'yyyy-MM-dd' ))) as b,
                            >         to_date( from_unixtime( unix_timestamp( '2038-06-14', 'yyyy-MM-dd' ))) as c;
Query: select to_date( from_unixtime( unix_timestamp( '2036-06-14', 'yyyy-MM-dd' ))) as a,
to_date( from_unixtime( unix_timestamp( '2037-06-14', 'yyyy-MM-dd' ))) as b,
to_date( from_unixtime( unix_timestamp( '2038-06-14', 'yyyy-MM-dd' ))) as c
+------------+------------+------------+
| a          | b          | c          |
+------------+------------+------------+
| 2036-06-14 | 2037-06-14 | 1902-05-08 |
+------------+------------+------------+
Fetched 1 row(s) in 0.41s
[quickstart.cloudera:21000] >

 

Here is the same query in Hive:

 

[cloudera@quickstart ~]$ hive

Logging initialized using configuration in file:/etc/hive/conf.dist/hive-log4j.properties
hive> select  to_date( from_unixtime( unix_timestamp( '2036-06-14', 'yyyy-MM-dd' ))) as a,
    >         to_date( from_unixtime( unix_timestamp( '2037-06-14', 'yyyy-MM-dd' ))) as b,
    >         to_date( from_unixtime( unix_timestamp( '2038-06-14', 'yyyy-MM-dd' ))) as c;
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1425573450783_0058, Tracking URL = http://quickstart.cloudera:8088/proxy/application_1425573450783_0058/
Kill Command = /usr/lib/hadoop/bin/hadoop job  -kill job_1425573450783_0058
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2015-03-13 11:03:43,809 Stage-1 map = 0%,  reduce = 0%
2015-03-13 11:03:55,265 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 1.71 sec
MapReduce Total cumulative CPU time: 1 seconds 710 msec
Ended Job = job_1425573450783_0058
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1   Cumulative CPU: 1.71 sec   HDFS Read: 284 HDFS Write: 33 SUCCESS
Total MapReduce CPU Time Spent: 1 seconds 710 msec
OK
2036-06-14 2037-06-14 2038-06-14
Time taken: 29.0 seconds, Fetched: 1 row(s)

 

Thanks

 

Morgan

 

 

1 ACCEPTED SOLUTION

Rising Star

Yes, it is a known issue. https://issues.cloudera.org/browse/IMPALA-1579 tracks it, and a fix will be in the next Impala release in CDH5.4.

 

Best,

Henry

View solution in original post

4 REPLIES 4

Contributor

Isn't that more of a general Unix time problem?

Explorer

Perhaps it is, but as I showed in the post above, Hive works correctly, but Impala fails.

 

Thanks

 

Morgan

Rising Star

Yes, it is a known issue. https://issues.cloudera.org/browse/IMPALA-1579 tracks it, and a fix will be in the next Impala release in CDH5.4.

 

Best,

Henry

Explorer

Henry,

 

Thanks for the info.

 

Morgan

Take a Tour of the Community
Don't have an account?
Your experience may be limited. Sign in to explore more.