
Hive-HBase Integration: Not possible to create HFiles for more than one Row


I created an external Hive table that refers to a corresponding HBase table (which has one column family, "cf"):

create external table testdb.testtable (rowkey String, hashvalue String, valuelist String)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf:valueList,cf:hashValue")
TBLPROPERTIES ('hbase.table.name' = 'hbase_table');

Now I want to create HFiles from an existing Hive table (origin_table), which has more than one column: rowkey (the HBase row key), hashvalue, and valuelist.

When I run the following query to create HFiles from the Hive table, I always get an exception.

Query:

set hfile.family.path=/tmp/testtable/cf;
set hive.hbase.generatehfiles=true;

INSERT OVERWRITE TABLE testdb.testtable 
SELECT k, valuelist, hashvalue from testdb.origin_table DISTRIBUTE BY k SORT BY k;

Exception:

java.io.IOException: Added a key not lexically larger than previous. 
Current cell = 055:test_2018-08-28 09:09:31/cf:hashValue/1536343090127/Put/vlen=3/seqid=0, 
    lastCell = 055:test_2018-08-28 09:09:31/cf:valueList/1536343090127/Put/vlen=11417/seqid=0

This seems to occur because of the different column names (hashValue and valueList). Swapping the columns in the query (SELECT k, hashvalue, valuelist FROM ...) makes no difference; the exception is still thrown.
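
To illustrate the ordering the exception message refers to, here is a minimal Python sketch. It assumes a simplified model in which cells are compared by row, then column family, then qualifier, all as raw bytes; timestamps and cell types are ignored. The helper names (cell_key, first_out_of_order) are hypothetical and are not part of Hive or HBase.

```python
# Simplified model of cell ordering: row, then family, then qualifier,
# each compared as raw bytes. Timestamps and cell types are ignored here.
def cell_key(row, family, qualifier):
    return (row.encode("utf-8"), family.encode("utf-8"), qualifier.encode("utf-8"))


def first_out_of_order(cells):
    """Return the index of the first cell that is not strictly larger
    than its predecessor, or None if the sequence is strictly ascending."""
    for i in range(1, len(cells)):
        if cell_key(*cells[i]) <= cell_key(*cells[i - 1]):
            return i
    return None


row = "055:test_2018-08-28 09:09:31"

# Order seen in the exception: valueList was written first, then hashValue.
as_written = [(row, "cf", "valueList"), (row, "cf", "hashValue")]

# The same cells sorted by the simplified key.
sorted_cells = sorted(as_written, key=lambda c: cell_key(*c))

print(first_out_of_order(as_written))    # index of the offending cell
print(first_out_of_order(sorted_cells))  # None when the order is valid
print([c[2] for c in sorted_cells])      # qualifiers in ascending byte order
```

Under this model, "hashValue" sorts before "valueList" (byte 'h' is smaller than byte 'v'), which matches the exception: the cell for cf:hashValue arrived after the cell for cf:valueList of the same row. If the model holds, the order in which the qualifiers are emitted for each row may matter more than the order of the columns in the SELECT.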

Of course, when I reduce this example to a Hive table with only one column (plus the key column), the INSERT command works, as there is no second column to write during HFile creation.

My question: how can I create HFiles through this Hive-HBase integration when there is more than one column (plus the key) to transfer?