Reply
Highlighted
Expert Contributor
Posts: 152
Registered: ‎07-01-2015

Impala timestamp conversion from bigint

Hi,

  we are using sqoop to transfer tables with timestamp columns into Parquet files. Based on the Cloudera recommendation we are reading those external tables as bigint and dividing the timestamp value by 1000 and converting to timestamp. 

 

The problem is that this conversion leads to unprecise timestamp values. Somehow Impala considers the result of the division as float data type and thus the conversion to Timestamp value results to a timestamp with a few nanoseconds difference. 

 

Here is an example:

 

select now(), cast( now() as double), cast( cast(now() as double) as timestamp);

 

time1.PNG

In the result you can see, that now() DOES NOT EQUAL to cast( cast( now() as double) as timestamp)

 

 

Casting the value to decimal do not help either:

 

select cast( cast( 1470312619.123 as decimal(20,3)) as timestamp);

time3.PNG

 

The result should have 123 milliseconds, but have a little less.

 

 

Is there a way to reliable convert a BIGINT timestamp with milliseconds to a Impala TIMESTAMP data type without loss of precision?

Thanks

 

 

 

Explorer
Posts: 20
Registered: ‎03-31-2017

Re: Impala timestamp conversion from bigint

I'm aware this is a very old question - but did you folks ever resolve this? When sqooping from an SQLServer instance to parquet files and then going throug the subsequent conversion steps, all of my times are off by about 5-6 hours.

 

For example, looking at two fields with the same ID in each data source:

 

SOURCE: | 56723157200011 | 2016-02-11 00:00:00 |

IMPALA: | 56723157200011 | 2016-02-11 06:00:00 |

 

 

Conversion process was (BIGINT/1000 AS TIMESTAMP).

 

 

 

Cloudera Employee
Posts: 278
Registered: ‎07-29-2015

Re: Impala timestamp conversion from bigint

@Tomas79sorry you didn't get a response to this question. It looks like the root cause of your problem could have been something like https://issues.apache.org/jira/browse/IMPALA-3437 . If you cast one of the input arguments to a decimal data type then the output will be decimal and you will get a precise response.

Cloudera Employee
Posts: 278
Registered: ‎07-29-2015

Re: Impala timestamp conversion from bigint

@mtrepanierthe 6 hour difference seems like a timezone conversion issue. My guess is that there was a missing or an undesired timezone conversion in there somewhere:

 

https://www.cloudera.com/documentation/enterprise/latest/topics/impala_timestamp.html#timestamp

 

By default casting an integer to a timestamp will give you a timestamp in UTC time.

Explorer
Posts: 20
Registered: ‎03-31-2017

Re: Impala timestamp conversion from bigint

[ Edited ]

Appreciate the reponse Tim - I believe that was the cause.

Expert Contributor
Posts: 152
Registered: ‎07-01-2015

Re: Impala timestamp conversion from bigint

Definitely, I had also a problem with 1-2 hours shift, this is because of the timezone difference and because of summer/winter time. But this can be easily fixed by cast and converting from UTC time.
Announcements