Currently, scan only returns the columns updated within the time range. But I need the entire row with other columns as well. How do I do that? Here is the snippet of my code. Please help!
Scan scan = new Scan();
scan.setTimeRange(1471710010773L, System.currentTimeMillis());
What is the value of the VERSIONS attribute of the table?
How many rows do you expect the scan to return within the time range?
Thanks a lot for the quick reply. I appreciate it.
Basically, what I am looking for is this: if none of the columns in a given row were updated within the time range, that row should not be returned in the results. This already works without any issue.
But if any of the columns were updated, I would like to see the entire row in the results, including the other non-updated columns, instead of only the updated ones.
"What's the value for VERSIONS attribute of the table?": It has multiple versions.
"How many rows do you expect the scan to return within the timerange?": Rows are being returned correctly, but I am not getting the other columns that were not updated.
A workaround would be to issue Gets for the row keys retrieved from the Scan.
Use the following API from (H)Table:
Result[] get(List<Get> gets) throws IOException;
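Since the scan may return a very large number of row keys, it probably makes sense to send the Gets in batches rather than in one huge list. Below is a minimal plain-Java sketch of that batching step only. BatchedLookup and fetchInBatches are hypothetical names, and the fetcher argument is a stand-in for whatever wraps the multi-get call; no HBase classes are used here, and the batch size is something you would have to tune yourself.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical helper: not part of HBase. It only shows how row keys
// collected by the time-range scan could be looked up in fixed-size batches.
final class BatchedLookup {

    // Splits keys into chunks of at most batchSize, hands each chunk to
    // fetcher, and concatenates the results in the original key order.
    // In an HBase client the fetcher would build one Get per key and call
    // the multi-get API mentioned above (assumption, not shown here).
    static <K, R> List<R> fetchInBatches(List<K> keys, int batchSize,
                                         Function<List<K>, List<R>> fetcher) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<R> results = new ArrayList<>(keys.size());
        for (int start = 0; start < keys.size(); start += batchSize) {
            int end = Math.min(start + batchSize, keys.size());
            results.addAll(fetcher.apply(keys.subList(start, end)));
        }
        return results;
    }
}
```

This keeps each request bounded in size, but it is still two round trips per row (one scan, one get), so it does not answer the single-query question.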
Thanks for the reply again. Is it possible to achieve this with a single query? We are dealing with millions of rows in the results of the scan.
I need to dig deeper.
You can write an endpoint coprocessor that does the retrieval server-side, but it is non-trivial.
In addition to the other answers in this thread, you could give Apache Phoenix a shot. It would address this problem and similar ones in the future while avoiding expensive development. It is just SQL on top of HBase.
You cannot use the Scan's time range for this because, as you have guessed, HBase is not a row-oriented engine but a cell-oriented one. The correct approach is to write a Filter that decides whether to include the whole row. To do that, set the time range in your filter rather than on the Scan, override the Filter.filterKeyValue() and filterRow() methods, keep state within the row, and decide whether to include the row based on whether any of its Cells fall within the time range. You can find example filters to look at in the HBase source code or elsewhere.
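To make the per-row state idea a bit more concrete, here is a minimal plain-Java model of the logic. It is not an actual HBase Filter: the Cell class, the RowTimeRangeModel class, and the scan() driver are hypothetical stand-ins, and the method names reset(), filterKeyValue(), and filterRow() only mirror the Filter methods they imitate. I am assuming the same [min, max) range semantics as Scan.setTimeRange. A real filter would extend FilterBase and would also need to be serializable and deployed on the region servers.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java model of a "whole row if any cell is in range" filter.
// No HBase classes are used; all types here are illustrative only.
final class RowTimeRangeModel {

    // Hypothetical stand-in for an HBase Cell.
    static final class Cell {
        final String row;
        final String qualifier;
        final long timestamp;

        Cell(String row, String qualifier, long timestamp) {
            this.row = row;
            this.qualifier = qualifier;
            this.timestamp = timestamp;
        }
    }

    private final long minStamp;   // inclusive
    private final long maxStamp;   // exclusive
    private boolean rowMatched;    // state kept within the current row

    RowTimeRangeModel(long minStamp, long maxStamp) {
        this.minStamp = minStamp;
        this.maxStamp = maxStamp;
    }

    // Clears the per-row state before the next row starts.
    void reset() {
        rowMatched = false;
    }

    // Every cell is kept; we only remember whether any cell is in range.
    void filterKeyValue(Cell cell) {
        if (cell.timestamp >= minStamp && cell.timestamp < maxStamp) {
            rowMatched = true;
        }
    }

    // Returns true when the whole row should be dropped.
    boolean filterRow() {
        return !rowMatched;
    }

    // Drives the model over cells that are already sorted by row key,
    // roughly the way a scanner would feed cells to a filter.
    static List<Cell> scan(List<Cell> sortedCells, long minStamp, long maxStamp) {
        RowTimeRangeModel filter = new RowTimeRangeModel(minStamp, maxStamp);
        List<Cell> out = new ArrayList<>();
        List<Cell> currentRow = new ArrayList<>();
        String currentKey = null;
        for (Cell cell : sortedCells) {
            if (currentKey != null && !cell.row.equals(currentKey)) {
                if (!filter.filterRow()) {
                    out.addAll(currentRow);
                }
                currentRow.clear();
                filter.reset();
            }
            currentKey = cell.row;
            filter.filterKeyValue(cell);
            currentRow.add(cell);
        }
        if (!currentRow.isEmpty() && !filter.filterRow()) {
            out.addAll(currentRow);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Cell> cells = Arrays.asList(
                new Cell("r1", "a", 100L), new Cell("r1", "b", 500L),
                new Cell("r2", "a", 100L), new Cell("r2", "b", 150L));
        for (Cell c : scan(cells, 400L, 1000L)) {
            System.out.println(c.row + " " + c.qualifier + " " + c.timestamp);
        }
    }
}
```

The key point is that the decision is deferred to the end of the row: cells outside the range are still collected, and the row is dropped only if no cell in it matched.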
Thanks for the reply. Do you have some samples I can take a look at? Thanks in advance!