I used the Kite SDK, so all the fields in the Avro schema are mapped to HBase columns. I have an HBase table with 3 columns (b, c and d).
When I insert a record into my HBase table via the CLI, the data is stored in the HBase columns. I then use Hive to view the data stored in the HBase table (via the HBase storage handler). I can read columns b and c correctly (with "hbase.table.default.storage.type" = "binary"), but column d is displayed as NULL (Hive returns NULL when it can't convert the data).
I think the issue is the way the data is encoded before it is stored in HBase. I read that the int, long and String types are encoded by Kite (1), but the other types are "Avro-serialized" (2) with a special encoding (variable-length zig-zag coding).
I did a scan on my table, and for an input of 130 in column d, the value stored is "\x02\x84\x02". This is consistent with how Avro encodes a union [null, int]: the index of the selected branch (int is branch 1), followed by the encoded int.
1) 130 in zig-zag coding = 260
2) 260 in binary, split into 7-bit groups = 0000010 0000100
3) 260 in variable-length form (low group first, continuation bit set on every byte except the last) = 10000100 00000010
4) Result of 3) in hex = 84 02
5) Branch index of int + int coding = 02 + 84 02 = 02 84 02 (the index 1 is itself zig-zag coded, which gives 02)
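To double-check the arithmetic, here is a minimal Python sketch of the Avro binary encoding for a nullable int. The function names (zigzag, varint, encode_nullable_int) are my own and are not part of Kite or Avro; the sketch assumes the union is declared as [null, int], so that int is branch 1.

```python
def zigzag(n: int) -> int:
    """Map a signed 32-bit int to an unsigned value (zig-zag coding)."""
    return ((n << 1) ^ (n >> 31)) & 0xFFFFFFFF


def varint(u: int) -> bytes:
    """Variable-length coding: 7 bits per byte, low-order group first."""
    out = bytearray()
    while u > 0x7F:
        out.append((u & 0x7F) | 0x80)  # continuation bit set: more bytes follow
        u >>= 7
    out.append(u)  # last byte, continuation bit clear
    return bytes(out)


def encode_int(n: int) -> bytes:
    """Encode an int as zig-zag followed by variable-length coding."""
    return varint(zigzag(n))


def encode_nullable_int(value, int_branch: int = 1) -> bytes:
    """Encode a [null, int] union: branch index first, then the value.

    int_branch=1 is an assumption based on the order [null, int].
    """
    if value is None:
        return encode_int(0)  # branch 0 (null) carries no payload
    return encode_int(int_branch) + encode_int(value)


encoded = encode_nullable_int(130)
print(encoded.hex(" "))
```

For 130 this should produce the same three bytes that the scan shows.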
So the stored data is meaningful, but I can't view it correctly in Hive.
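For comparison, this is a small Python sketch of how the stored bytes could be decoded by hand. The helper names are hypothetical. The last step relies on an assumption I have not verified: that Hive's binary storage type expects a fixed-width 4-byte big-endian int (the layout produced by HBase's Bytes.toBytes(int)), which would explain why it cannot read the Avro bytes.

```python
import struct


def read_varint(buf: bytes, pos: int = 0):
    """Read one variable-length unsigned value; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # continuation bit clear: this was the last byte
            return result, pos
        shift += 7


def unzigzag(u: int) -> int:
    """Reverse zig-zag coding back to a signed int."""
    return (u >> 1) ^ -(u & 1)


def decode_nullable_int(buf: bytes):
    """Decode a [null, int] union (assumes null is branch 0, int is branch 1)."""
    raw_branch, pos = read_varint(buf)
    if unzigzag(raw_branch) == 0:
        return None
    raw_value, _ = read_varint(buf, pos)
    return unzigzag(raw_value)


stored = b"\x02\x84\x02"  # value seen in the HBase scan
decoded = decode_nullable_int(stored)

# Assumed layout that a fixed-width reader would expect for the same number.
fixed_width = struct.pack(">i", decoded)
print(decoded, fixed_width.hex(" "))
```

If that assumption holds, the two byte layouts differ in both length and content, so a fixed-width reader would not recover 130 from the stored value.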
Am I doing something wrong? Where can I find something to help me solve this issue? Is this a Kite problem or a Hive problem?