02-18-2014 01:26 AM - edited 02-18-2014 04:35 AM
I would like to access data in HBase from Excel.
To do this, I intend to use an ODBC connection via Impala.
I linked a table in Hive to the respective table in HBase.
This works fine: I can access all of the table's content in Hive.
Unfortunately, I cannot access the table's content in Impala.
The tables are visible in Impala, but their content cannot be queried.
Impala seems to have an issue with those "mapped" columns.
I tried 'invalidate metadata', but it had no effect on the content.
The impala shell throws the following exception when trying to access it:
Query: select * from ts
ERROR: IllegalStateException: null
I followed Cloudera's guide on this matter:
Details on the HBase table:
It is a very wide table with three rows and thousands of columns.
In Hive, the table is mapped using the methods described here: http://www.bidn.com/blogs/cprice1979/ssas/4608/introduction-to-hive-collections
Have I missed a step needed to synchronize/link the Impala and Hive tables?
Or is there a better approach for my use case?
Thank you very much in advance.
PS: Structure of the Hive table:
CREATE EXTERNAL TABLE ts(
  id string,
  props map<bigint,float>
)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ( "hbase.columns.mapping" = ":key, cf:" )
TBLPROPERTIES("hbase.table.name" = "ts");
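To make the mapping in this DDL explicit, here is a minimal Python sketch of how I understand the "hbase.columns.mapping" string to pair up with the Hive columns. It is only an illustration, not Hive's actual parser: the function name describe_mapping is hypothetical, and stripping whitespace around entries is a simplification of mine.

```python
# Illustrative only: shows how each entry of "hbase.columns.mapping"
# lines up with a Hive column. This is NOT Hive's real parser;
# describe_mapping is a hypothetical helper.

def describe_mapping(hive_columns, mapping):
    # Assumption: whitespace around entries is stripped for readability.
    entries = [entry.strip() for entry in mapping.split(",")]
    if len(entries) != len(hive_columns):
        raise ValueError("mapping needs exactly one entry per Hive column")

    result = []
    for (name, hive_type), entry in zip(hive_columns, entries):
        if entry == ":key":
            target = "HBase row key"
        else:
            family, _, qualifier = entry.partition(":")
            if qualifier:
                target = f"single cell {family}:{qualifier}"
            else:
                # Empty qualifier: the whole column family feeds one map column.
                target = f"all qualifiers of column family '{family}'"
        result.append((name, hive_type, target))
    return result


columns = [("id", "string"), ("props", "map<bigint,float>")]
for name, hive_type, target in describe_mapping(columns, ":key, cf:"):
    print(f"{name} ({hive_type}) <- {target}")
```

Read this way, "id" is the row key and "props" collects every qualifier in column family "cf" into a single map-typed column, which is the "mapped" column I suspect Impala has trouble with.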
Impala Error Log:
Session ID: 384a71b4949fb90f:50a8f595c08831b8
Session Type: BEESWAX
Start Time: 2014-02-18 13:34:48.398540000
End Time: 2014-02-18 13:34:48.402393000
Query Type: N/A
Query State: EXCEPTION
Query Status: IllegalStateException: null
Impala Version: impalad version 1.2.0-cdh5.0.0-beta-1 RELEASE (build ee825cb06b23d3ab97cdd87e13cbbb630bd75b98)
Network Address: 10.115.106.141:34364
Default Db: default
Sql Statement: select * from ts
Query Timeline: 4.76ms
- Start execution: 1.356ms (1.356ms)
- Unregister query: 3.847ms (2.491ms)