- Subscribe to RSS Feed
- Mark Question as New
- Mark Question as Read
- Float this Question for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page
I need to get data from hbase table for a given date range
- Labels:
-
Apache HBase
Created on ‎02-16-2017 01:19 PM - edited ‎09-16-2022 04:06 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I am using below scan statement in Hbase to obtain data from a table wherein the column A30 should be between the given date range.How can this be achieved, as below query pulls up record beyond range of 01-JUN-2011 and 30-JUN-2011.
scan 'mytable',{FILTER => "SingleColumnValueFilter('cf1','A15',=, 'binary:10') AND SingleColumnValueFilter('cf1','F49' ,=, 'binary:6') AND SingleColumnValueFilter('cf1','A30',>=, 'binary:01-JUN-2011') AND SingleColumnValueFilter('cf1','A30',<=, 'binary:30-JUN-2011')"}
Data in hbase table ;-
column=cf1:A30, timestamp=1487080632072, value=15-JAN-14
Created ‎02-16-2017 06:25 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
HBase stores data is stored in bytes. The date you have stored 15-JAN-14 is definitely less than 30-JUN-2011 lexicographically. It is however, smaller than 01-JUN-2011 because JAN is less than JUN (at least that's what it seems). May be put the two date condition in brackets. But, please remember, what you are doing is not really a date comparison as far as HBase is concerned. It is a string comparison.
if you want to do proper date comparisons. then you should store timestamps instead of date as a string. or use this format "20110630".
Created ‎02-19-2017 02:07 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
actually on a second thought, I realize 15-JAN-14 is also greater than 01-JUN-2011 because 15 is greater than 01 and it is less than 30-JUN-2011 because 15 is less than 30. HBase is behaving exactly how it is supposed to behave. You need to store date in a different format, like YYYYMMDD. When you do that, 20140115 does not fall between 20110601 and 20110630 lexicographically and you will get your expected result.
Another way is to convert Date to timestamp.
