02-12-2016 06:59 AM
I have the following avro schema “user.avsc”:
I used the Kite SDK, so all the fields in the Avro schema are mapped to HBase columns. As a result, I have an HBase table with 3 columns (b, c and d).
When I insert a record into my HBase table via the CLI, the data is stored in my HBase columns. Then I use Hive to view the data stored in the HBase table (via the HBase storage handler). I can successfully read columns b and c (with "hbase.table.default.storage.type" = "binary"), but column d is displayed as NULL (Hive returns NULL when it can't convert the data).
I think the issue here is the way the data is encoded before it is stored in HBase. I read that the int, long and String types are encoded by Kite (1), but the other types are “avro-serialized” (2) with a special encoding (variable-length zig-zag coding).
I did a scan on my table, and for an input of 130 in column d, the stored value is: “\x02\x84\x02”. This is consistent with the way Avro encodes a union [null, int]: the zig-zag-encoded index of the selected branch, followed by the value in that branch's own encoding.
index of the int branch (1, zig-zag encoded) + int coding of 130 = 02 + 84 02 = 02 84 02
So, the data stored has a meaning but I can’t visualize it correctly.
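To double-check the arithmetic, here is a minimal Python sketch of Avro's variable-length zig-zag coding applied to a [null, int] union. It is only an illustration written from the Avro binary encoding rules, not the code Kite or Hive actually runs; the function names are made up for this example, and it assumes null is branch 0 and int is branch 1.

```python
def encode_long(n: int) -> bytes:
    """Encode a signed int/long as an Avro variable-length zig-zag value."""
    # Zig-zag: interleave negatives and positives (0, -1, 1, -2, 2 -> 0, 1, 2, 3, 4).
    z = ((n << 1) ^ (n >> 63)) & 0xFFFFFFFFFFFFFFFF
    out = bytearray()
    # Varint: 7 bits per byte, least significant group first,
    # high bit set on every byte except the last.
    while z > 0x7F:
        out.append((z & 0x7F) | 0x80)
        z >>= 7
    out.append(z)
    return bytes(out)


def decode_long(buf: bytes, pos: int = 0):
    """Decode one zig-zag varint; return (value, next position)."""
    shift = 0
    z = 0
    while True:
        b = buf[pos]
        pos += 1
        z |= (b & 0x7F) << shift
        if not b & 0x80:
            break
        shift += 7
    return (z >> 1) ^ -(z & 1), pos


def encode_nullable_int(value, int_branch: int = 1) -> bytes:
    """Encode a value of union [null, int] (assumed order: null=0, int=1)."""
    if value is None:
        return encode_long(0)  # branch index only, null has no payload
    return encode_long(int_branch) + encode_long(value)


def decode_nullable_int(buf: bytes):
    """Decode a value of union [null, int] (assumed order: null=0, int=1)."""
    branch, pos = decode_long(buf)
    if branch == 0:
        return None
    value, _ = decode_long(buf, pos)
    return value


print(encode_nullable_int(130).hex(" "))
print(decode_nullable_int(b"\x02\x84\x02"))
```

With these assumptions, encoding 130 gives the same three bytes as my scan, and decoding “\x02\x84\x02” gives 130 back, so the bytes in HBase look like valid Avro rather than corrupted data.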
Am I doing something wrong? Where can I find something to help me solve this issue? Is this a Kite problem or a Hive problem?
Help would be appreciated. Thanks in advance.