Reply
Former Member
Posts: 0

Impala - Timestamp - order by works?

Hi,

 

I have a table in Hbase and want to access through impala.

 

The table has a row key combination of three columns (abcshsk#2017-08-01 00:00:00.0#cpu)

 

Three columns (string,timestamp,string) this is my row key now.

 

 

I have to write a Impala query to display column1, timestamp as Day,column 2 from table name where condition 1 and condition 2 order by day,timestamp.

 

 

The question is how to cast a timestamp in impala to date?

 

Cast(timestamp,date) is not there. how to do?

 

I used to_date(timestamp)

 

I can able to display Day from the query result.

 

condition two is date range.

 

how to select date ranges using timestamp( date range between) ?

 

how to order by Day and to_date(timestamp) ?

 

 

will order by work in Hbase? whatever column in order by i use i am not able to order.

 

I tried order by a double data type column also.

 

 

any condition in impala to use order by in hbase?

 

 

pls suggest. Thanks.

Cloudera Employee
Posts: 416
Registered: ‎07-29-2015

Re: Impala - Timestamp - order by works?

Impala doesn't have a date data type, it does have a timestamp type though: https://www.cloudera.com/documentation/enterprise/latest/topics/impala_timestamp.html. to_date() converts timestamp to a string.https://www.cloudera.com/documentation/enterprise/latest/topics/impala_datetime_functions.html

 

If you want to remove the time portion of the timestamp you can use trunc(ts, "DD") to get the timestamp of midnight on that day: https://www.cloudera.com/documentation/enterprise/latest/topics/impala_datetime_functions.html

 

I don't understand your question about "order by". Order by should work for all scalar data types in Impala (timestamp, double, etc). What do you want it to do and what is it doing now?

Former Member
Posts: 0

Re: Impala - Timestamp - order by works?

I have timestamp column , I want to order by date,day.

>From timestamp I applied logic and converted as day.

So I need to order by date. The timestamp column has date and time.

But I need to order by date.

It's not working

My record size is one hundred sixty million records.

Database is in hbase.
Cloudera Employee
Posts: 416
Registered: ‎07-29-2015

Re: Impala - Timestamp - order by works?

I'd expect ORDER BY trunc(ts, "DD") to work. E.g. on my system this works:

 

[localhost:21000] > select timestamp_col, tinyint_col from functional_hbase.alltypestiny order by trunc(timestamp_col, 'DD'), tinyint_col desc;
+---------------------+-------------+
| timestamp_col       | tinyint_col |
+---------------------+-------------+
| 2009-01-01 00:01:00 | 1           |
| 2009-01-01 00:00:00 | 0           |
| 2009-02-01 00:01:00 | 1           |
| 2009-02-01 00:00:00 | 0           |
| 2009-03-01 00:01:00 | 1           |
| 2009-03-01 00:00:00 | 0           |
| 2009-04-01 00:01:00 | 1           |
| 2009-04-01 00:00:00 | 0           |
+---------------------+-------------+
Former Member
Posts: 0

Re: Impala - Timestamp - order by works?

Hi,

 

 

Hbase table

 

table in hbase

 

hbid is combination of 3 columns. two string and one timestamp column

anhdu6jbc#cpu#2017-08-01 00:00:09.0= hostname#matrix__timestamp

hbid- anhdu6jbc#cpu__2017-08-01 00:00:09.0

country-string-addf

matrix_value- double-8.9

 

select extract(split_part(hbid,'_',3),"day") as Day,split_part(split_part(hbid,'_',2),'#',1) as Hostname,split_part(hbid,'_',3) as starttime,matrix_value from table where split_part(split_part(hbid,'_',2),'#',2)='cpu'and matrix_value <=100 and split_part(split_part(hbid,'_',2),'#',1)='2jdnca' and (to_date(split_part(hbid,'_',3)) >= '2017-08-01' and to_date(split_part(hbid,'_',3)) <= '2017-08-08') order by matrix_value limit 10

 

Exception

 

  • RetriesExhaustedException: Failed after attempts=3, exceptions: Tue Sep 26 10:17:39 SGT 2017, RpcRetryingCaller{globalStartTime=1506392256600, pause=100, retries=3}, java.io.IOException: Call to x01shdpeapp4a.sgp.dbs.com/10.92.139.146:60020 failed on local exception: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=1483400, waitTime=3001, operationTimeout=3000 expired. Tue Sep 26 10:17:42 SGT 2017, RpcRetryingCaller{globalStartTime=1506392256600, pause=100, retries=3}, java.io.IOException: Call to x01shdpeapp4a.sgp.dbs.com/10.92.139.146:60020 failed on local exception: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=1483402, waitTime=3001, operationTimeout=3000 expired. Tue Sep 26 10:17:46 SGT 2017, RpcRetryingCaller{globalStartTime=1506392256600, pause=100, retries=3}, java.io.IOException: Call to x01shdpeapp4a.sgp.dbs.com/10.92.139.146:60020 failed on local exception: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=1483404, waitTime=3001, operationTimeout=3000 expired. CAUSED BY: IOException: Call to x01shdpeapp4a.sgp.dbs.com/10.92.139.146:60020 failed on local exception: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=1483404, waitTime=3001, operationTimeout=3000 expired. CAUSED BY: CallTimeoutException: Call id=1483404, waitTime=3001, operationTimeout=3000 expired.

select extract(split_part(hbid,'_',3),"day") as Day,split_part(split_part(hbid,'_',2),'#',1) as Hostname,split_part(hbid,'_',3) as starttime,matrix_value from tablename where split_part(split_part(hbid,'_',2),'#',2)='cpu'and matrix_value <=100 and split_part(split_part(hbid,'_',2),'#',1)='shdjd1d' order by trunc(split_part(hbid,'_',3),'DD')  

 

Exception

 

  • RetriesExhaustedException: Failed after attempts=3, exceptions: Tue Sep 26 10:07:33 SGT 2017, RpcRetryingCaller{globalStartTime=1506391650523, pause=100, retries=3}, java.io.IOException: Call to x01shdpeapp4a.sgp.dbs.com/10.92.139.146:60020 failed on local exception: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=1483303, waitTime=3001, operationTimeout=3000 expired. Tue Sep 26 10:07:36 SGT 2017, RpcRetryingCaller{globalStartTime=1506391650523, pause=100, retries=3}, java.io.IOException: Call to x01shdpeapp4a.sgp.dbs.com/10.92.139.146:60020 failed on local exception: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=1483305, waitTime=3001, operationTimeout=3000 expired. Tue Sep 26 10:07:40 SGT 2017, RpcRetryingCaller{globalStartTime=1506391650523, pause=100, retries=3}, java.io.IOException: Call to x01shdpeapp4a.sgp.dbs.com/10.92.139.146:60020 failed on local exception: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=1483307, waitTime=3001, operationTimeout=3000 expired. CAUSED BY: IOException: Call to x01shdpeapp4a.sgp.dbs.com/10.92.139.146:60020 failed on local exception: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=1483307, waitTime=3001, operationTimeout=3000 expired. CAUSED BY: CallTimeoutException: Call id=1483307, waitTime=3001, operationTimeout=3000 expired.