I would like to use Excel in order to access data in HBase.
For this reason I intend to use an ODBC connection via Impala.
I linked a table in Hive to the respective table in HBase.
This works fine: I can access all of the table's content in Hive.
Unfortunately, I cannot access the table's content in Impala.
The table is visible in Impala, but its content is not.
Impala seems to have an issue with the "mapped" columns.
I tried 'invalidate metadata', which had no effect on the content.
The impala shell throws the following exception when trying to access it:
Query: select * from ts
ERROR: IllegalStateException: null
I followed Cloudera's guide on this matter:
Details on the HBase table:
It is a very wide table with three rows and thousands of columns.
In Hive, the table is mapped using the methods described here: http://www.bidn.com/blogs/cprice1979/ssas/4608/introduction-to-hive-collections
Have I missed a step needed to synchronize or link the Impala and Hive tables?
Or is there a better approach for my use case?
Thank you very much in advance.
PS: Structure of the Hive table:
CREATE EXTERNAL TABLE ts (
  id string,
  props map<bigint,float>
)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key, cf:")
TBLPROPERTIES ("hbase.table.name" = "ts");
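For comparison, a mapping that avoids the map<bigint,float> column would name each HBase column explicitly with a scalar type. The following is only a sketch: the table name ts_flat and the qualifiers cf:1 and cf:2 are hypothetical, and it assumes that the map-typed column is what Impala rejects, which the error message itself does not confirm. With thousands of columns, listing each one this way may not be practical.

```sql
-- Hypothetical alternative mapping: one scalar Hive column per HBase qualifier.
-- ts_flat, c1, c2 and the qualifiers cf:1 / cf:2 are placeholders, not taken
-- from the real table.
CREATE EXTERNAL TABLE ts_flat (
  id string,   -- HBase row key
  c1 float,    -- value stored under cf:1 (assumed qualifier)
  c2 float     -- value stored under cf:2 (assumed qualifier)
)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf:1,cf:2")
TBLPROPERTIES ("hbase.table.name" = "ts");
```

After creating such a table in Hive, 'invalidate metadata' would presumably still be needed in the Impala shell before querying ts_flat.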
Impala Error Log:
Query (id=474ff9a95886906e:6db3b00572c6d69b):
Summary:
Session ID: 384a71b4949fb90f:50a8f595c08831b8
Session Type: BEESWAX
Start Time: 2014-02-18 13:34:48.398540000
End Time: 2014-02-18 13:34:48.402393000
Query Type: N/A
Query State: EXCEPTION
Query Status: IllegalStateException: null
Impala Version: impalad version 1.2.0-cdh5.0.0-beta-1 RELEASE (build ee825cb06b23d3ab97cdd87e13cbbb630bd75b98)
User: graeblef
Network Address: 10.115.106.141:34364
Default Db: default
Sql Statement: select * from ts
Query Timeline: 4.76ms
- Start execution: 1.356ms (1.356ms)
- Unregister query: 3.847ms (2.491ms)