Created 05-21-2020 09:33 PM
I am working on a simple program that fetches a row from HBase. The get does not work through the HBase Java API, but the same row is retrievable from the shell.
Example row Ids:
Result for one of the row IDs through the shell:
hbase(main):003:0> get 'OSMNodes_z3_geometry_ingestionTimestamp_v6',"\x00\x0A@E\xFFn[\x18\x9F\xD4-1447846881#5241968320"
COLUMN CELL
d: timestamp=1588532276719, value=\x03\x00\x06\x02\x00\x12\x00-\x005\x00=\x00R\x00Y\x01x\x00\x00\x0
0\x00\x01\x08\x03\xC0V\xE9\x9AU\x81;\xD5\xBF\xF5\xA7FE^\xAE\xE2\x7F\xF8\x00\x00\x00\x00\x00\x00\
x00\x00\x01q\x9Av\xA9\x00\x00\x00\x01q\x9B\x89Q\x801447846881#524196832\xB0geojso\xEE\xDE\x04{"g
eometry":{"type":"Point","coordinates":[-91.6500448,-1.3533385]},"id":"5241968320","properties":
{"tags":{},"changesetId":53988821,"version":1,"uid":1975220,"user":"jptolosa87","featureSource":
"OSM","sourceTimestamp":"2017-11-22 00:01:26","ingestionTimestamp":"2020-04-21 02:00:00"}}
Java API Code
public static void main(String[] args) {
    Configuration conf = HBaseConfiguration.create();
    conf.addResource(new Path("/etc/hbase/conf/hbase-default.xml"));
    conf.addResource(new Path("/etc/hbase/conf/hbase-site.xml"));
    try {
        HBaseAdmin.checkHBaseAvailable(conf);
    } catch (Exception e) {
        e.printStackTrace();
        return;
    }
    Map<String, String> argVector = parseArgVector(args);
    final String fileName = argVector.get("fileName");
    final String hbaseTableName = argVector.get("hbaseTableName");
    final String rowId = argVector.get("rowId");
    final HTable table = getTable(hbaseTableName, conf);
    Result result = getRow(table, rowId);
    System.out.println(result);
    System.out.println(result.isEmpty());
    System.out.println(Bytes.toString(result.getRow()));
    System.out.println(Bytes.toString(result.getValue(Bytes.toBytes("d"), Bytes.toBytes(""))));
}

private static Result getRow(final HTable table, final String rowId) {
    try {
        Get get = new Get(Bytes.toBytes(rowId));
        System.out.println(Bytes.toString(get.getRow()));
        return table.get(get);
    } catch (Exception e) {
        e.printStackTrace();
    }
    return null;
}
Created 05-27-2020 05:40 PM
The row IDs in our HBase tables are not made up purely of hex escapes. They mix escaped bytes with printable characters, as pointed out earlier in this thread.
\x00\x0A@E\xFFn[\x18\x9F\xD4-1447846881#524196832
The solution I found was to use the toBytesBinary function provided by the Bytes utility class.
This method reads each \xNN escape in the string as a single byte, instead of encoding the backslash, the x, and the two hex digits as four separate bytes the way Bytes.toBytes does. In getRow, that means building the Get with Bytes.toBytesBinary(rowId) rather than Bytes.toBytes(rowId).
Hope this helps!!
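For anyone who wants to see the difference without a cluster, here is a rough, JDK-only sketch. The parseEscaped helper and the RowKeyBytesDemo class are hypothetical names of my own; the helper only imitates what I understand Bytes.toBytesBinary to do and is not the HBase implementation. In real code you would call Bytes.toBytesBinary directly.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

public class RowKeyBytesDemo {

    // Hypothetical stand-in for Bytes.toBytesBinary: each \xNN escape becomes
    // one byte; every other character is copied as a single byte.
    public static byte[] parseEscaped(String in) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int i = 0;
        while (i < in.length()) {
            char ch = in.charAt(i);
            boolean isEscape = ch == '\\'
                    && i + 3 < in.length()
                    && in.charAt(i + 1) == 'x'
                    && Character.digit(in.charAt(i + 2), 16) >= 0
                    && Character.digit(in.charAt(i + 3), 16) >= 0;
            if (isEscape) {
                int hi = Character.digit(in.charAt(i + 2), 16);
                int lo = Character.digit(in.charAt(i + 3), 16);
                out.write((hi << 4) | lo);   // one byte for the whole escape
                i += 4;
            } else {
                out.write((byte) ch);        // printable character, one byte
                i++;
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // The row ID exactly as it arrives on the command line (literal backslashes).
        String rowId = "\\x00\\x0A@E\\xFFn[\\x18\\x9F\\xD4-1447846881#5241968320";

        // What Bytes.toBytes(rowId) effectively sends: the characters themselves.
        byte[] literal = rowId.getBytes(StandardCharsets.UTF_8);
        // What the table actually stores: the bytes the escapes stand for.
        byte[] binary = parseEscaped(rowId);

        System.out.println("literal bytes: " + literal.length);
        System.out.println("binary bytes:  " + binary.length);
    }
}
```

The literal encoding comes out longer than the parsed one because every escape collapses from four bytes to one, so the two byte arrays can never address the same row.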
Created 05-22-2020 11:28 AM
Some additional information you could provide to help the community answer the question:
Are there any errors that Java returns when querying HBase or does it just silently not show any rows?
Is the same user executing both tasks (through shell and java)?
Can any other rows be retrieved from Java?
Created 05-26-2020 10:27 AM
Are there any errors that Java returns when querying HBase or does it just silently not show any rows?
> The return is null in the Result object
Is the same user executing both tasks (through shell and java)?
> Yes
Can any other rows be retrieved from Java?
> Yes, from a different table that has a comparatively less complex row key.
Created 05-26-2020 02:52 PM
Perhaps the problem lies in how Java interprets the args you pass when you run your code? It may differ from how the shell client interprets them (relevant discussion here).
Can you show the command that executes your Java code, complete with the arguments passed to it?
Also, include printed arguments (e.g. System.out.println(rowId)) in your code.
Execute the code for the same key as you did in shell (i.e. \x00\x0A@E\xFFn[\x18\x9F\xD4-1447846881#5241968320)
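If plain printing is not conclusive, dumping the raw bytes of each argument may help. This is only a minimal sketch using the JDK; the ArgDump class and its hex helper are made-up names for illustration, not part of HBase.

```java
import java.nio.charset.StandardCharsets;

public class ArgDump {

    // Renders the UTF-8 bytes of one argument as space-separated hex pairs.
    public static String hex(String arg) {
        StringBuilder sb = new StringBuilder();
        for (byte b : arg.getBytes(StandardCharsets.UTF_8)) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // Print every argument with its character count and byte dump.
        for (String arg : args) {
            System.out.println(arg + " -> " + arg.length() + " chars: " + hex(arg));
        }
    }
}
```

If the dump for the row ID starts with 5C 78 30 30 (backslash, x, 0, 0) instead of a single 00, the JVM received the escape sequence as literal text rather than as the byte it stands for.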
Created 05-26-2020 03:42 PM
Can you show the command that executes your Java code, complete with the arguments passed to it?
> java -cp hadoop-lib/*:hbase-lib/*: GetFeature --hbaseTableName "OSMNodes_z3_geometry_ingestionTimestamp_v6" --rowId "\x00\x0A@E\xFFn[\x18\x9F\xD4-1447846881#5241968320"
Also, include printed arguments (e.g. System.out.println(rowId)) in your code.
> Table name: OSMNodes_z3_geometry_ingestionTimestamp_v6
> rowId: \x00\x0A@E\xFFn[\x18\x9F\xD4-1447846881#5241968320
Execute the code for the same key as you did in shell (i.e. \x00\x0A@E\xFFn[\x18\x9F\xD4-1447846881#5241968320)
> The get behaves differently for this rowId depending on the quotes.
> If I put double quotes around the rowId, I get a different row than if I put single quotes around it.
> Example (note the column name for the distinction):
hbase(main):027:0> get 'OSMNodes_z3_geometry_ingestionTimestamp_v6',"\x00\x0A@E\xFFn[\x18\x9F\xD4-1447846881#5241968320"
COLUMN CELL
d: timestamp=1590532417003, value=\x5Cx03\x5Cx00\x5Cx06\x5Cx02\x5Cx00\x5Cx12\x5Cx00-\x5Cx00
5\x5Cx00=\x5Cx00R\x5Cx00Y\x5Cx01x\x5Cx00\x5Cx00\x5Cx00\x5Cx00\x5Cx01\x5Cx08\x5Cx03\x5CxC
0V\x5CxE9\x5Cx9AU\x5Cx81;\x5CxD5\x5CxBF\x5CxF5\x5CxA7FE^\x5CxAE\x5CxE2\x5Cx7F\x5CxF8\x5C
x00\x5Cx00\x5Cx00\x5Cx00\x5Cx00\x5Cx00\x5Cx00\x5Cx00\x5Cx01q\x5Cx9Av\x5CxA9\x5Cx00\x5Cx0
0\x5Cx00\x5Cx01q\x5Cx9B\x5Cx89Q\x5Cx801447846881#524196832\x5CxB0geojso\x5CxEE\x5CxDE\x5
Cx04{"geometry":{"type":"Point","coordinates":[-91.6500448,-1.3533385]},"id":"5241968320
","properties":{"tags":{},"changesetId":53988821,"version":1,"uid":1975220,"user":"jptol
osa87","featureSource":"OSM","sourceTimestamp":"2017-11-22 00:01:26","ingestionTimestamp
":"2020-04-21 02:00:00"}}
1 row(s) in 0.0070 seconds
hbase(main):028:0> get 'OSMNodes_z3_geometry_ingestionTimestamp_v6','\x00\x0A@E\xFFn[\x18\x9F\xD4-1447846881#5241968320'
COLUMN CELL
d:d timestamp=1590532440913, value=\x5Cx03\x5Cx00\x5Cx06\x5Cx02\x5Cx00\x5Cx12\x5Cx00-\x5Cx00
5\x5Cx00=\x5Cx00R\x5Cx00Y\x5Cx01x\x5Cx00\x5Cx00\x5Cx00\x5Cx00\x5Cx01\x5Cx08\x5Cx03\x5CxC
0V\x5CxE9\x5Cx9AU\x5Cx81;\x5CxD5\x5CxBF\x5CxF5\x5CxA7FE^\x5CxAE\x5CxE2\x5Cx7F\x5CxF8\x5C
x00\x5Cx00\x5Cx00\x5Cx00\x5Cx00\x5Cx00\x5Cx00\x5Cx00\x5Cx01q\x5Cx9Av\x5CxA9\x5Cx00\x5Cx0
0\x5Cx00\x5Cx01q\x5Cx9B\x5Cx89Q\x5Cx801447846881#524196832\x5CxB0geojso\x5CxEE\x5CxDE\x5
Cx04{"geometry":{"type":"Point","coordinates":[-91.6500448,-1.3533385]},"id":"5241968320
","properties":{"tags":{},"changesetId":53988821,"version":1,"uid":1975220,"user":"jptol
osa87","featureSource":"OSM","sourceTimestamp":"2017-11-22 00:01:26","ingestionTimestamp
":"2020-04-21 02:00:00"}}
1 row(s) in 0.0060 seconds
hbase(main):029:0>
Created 05-26-2020 07:48 PM
OK, regarding single quotes vs. double quotes: you have to use double quotes in the shell every time. Text in single quotes is treated as a literal (see p. 271 of HBase: The Definitive Guide).
After some more research I came across this post, which seems to describe your problem exactly, along with two solutions for modifying your Java code. To summarize, the Java client for HBase expects row keys in human-readable format, not their hexadecimal representation. The solution is to read your args as Double type, not String.
Hope that finally resolves it.