I am trying to integrate Hive with an HBase instance that uses Avro to store data. I have followed the procedure described in the documentation here. In HBase I have one table called 'events' with one column family (masterdata) and one payload column (payload).
The stack is running on Hortonworks 2.4.
The procedure to create the table in Hive is stored as setup.hql:
CREATE EXTERNAL TABLE masterdata_events
ROW FORMAT SERDE 'org.apache.hadoop.hive.hbase.HBaseSerDe'
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES (
"hbase.columns.mapping" = ":key,masterdata:payload",
"masterdata.payload.avro.schema.url" = 'hdfs://cluster/schema.avsc',
"hive.serialization.extend.nesting.levels" = "true"
)
TBLPROPERTIES ("hbase.table.name" = "events", "hbase.struct.autogenerate" = "true");
I run it with 'hive -f setup.hql'. The output is as follows:
Time taken: 2.625 seconds
Time taken: 3.301 seconds
Time taken: 0.066 seconds
FAILED: IllegalArgumentException Error: : expected at the end of 'string:struct<product:struct<productmain:struct<id:string,...so on and so on ...>'
Note that the string between the two single quotes is 4008 characters long. After searching on Google, I found an answer from another user in this community saying that you have to increase the size of SERDE_PARAMS in the Hive metastore. I applied that workaround, but I still get the same error. I have also tried "masterdata.payload.avro.schema.literal" instead of the schema URL, with no change. The Avro schema is valid JSON (verified).
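For context, the kind of change that workaround involves looks roughly like the sketch below. It assumes a MySQL-backed metastore and that the serde properties are stored in the PARAM_VALUE column of the SERDE_PARAMS table; the target type MEDIUMTEXT is an illustrative choice, not necessarily what the other answer prescribed, so check both against your own metastore schema first.

```sql
-- Sketch only: run against the Hive metastore database (assumed to be MySQL),
-- not inside Hive itself. Back up the metastore before changing its schema.

-- Inspect the current definition of the column that holds serde property values.
SHOW COLUMNS FROM SERDE_PARAMS LIKE 'PARAM_VALUE';

-- Widen the column so that long serde property values are not cut off.
-- MEDIUMTEXT is an assumed, illustrative target type.
ALTER TABLE SERDE_PARAMS MODIFY PARAM_VALUE MEDIUMTEXT;
```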