I want to create HFiles from an existing Hive table (origin_table) that has more than one column: rokey (the HBase row key), hashvalue and valuelist.
When I run the following query to create the HFiles from the Hive table, I always get an exception:
INSERT OVERWRITE TABLE testdb.testtable SELECT k, valuelist, hashvalue from testdb.origin_table DISTRIBUTE BY k SORT BY k;
java.io.IOException: Added a key not lexically larger than previous.
Current cell = 055:test_2018-08-28 09:09:31/cf:hashValue/1536343090127/Put/vlen=3/seqid=0,
lastCell = 055:test_2018-08-28 09:09:31/cf:valueList/1536343090127/Put/vlen=11417/seqid=0
This seems to occur because of the different column names (hashValue and valueList). It makes no difference if I swap the columns in the query (SELECT k, hashvalue, valuelist FROM ...); it throws the same exception.
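To illustrate what the exception message appears to be checking, here is a minimal sketch in Python. It assumes a simplified model of HBase's cell ordering (row key, then column family, then qualifier, compared as raw bytes) and ignores timestamp and cell type. The helper names `cell_key` and `first_out_of_order` are made up for this illustration and are not part of Hive or HBase.

```python
# Simplified model: a cell is identified by (row, family, qualifier),
# compared as bytes. Timestamp and cell type are ignored in this sketch.
def cell_key(row, family, qualifier):
    return (row.encode("utf-8"), family.encode("utf-8"), qualifier.encode("utf-8"))


def first_out_of_order(cells):
    """Return the index of the first cell that sorts before its predecessor,
    or None if the sequence is in non-decreasing order."""
    prev = None
    for i, cell in enumerate(cells):
        key = cell_key(*cell)
        if prev is not None and key < prev:
            return i
        prev = key
    return None


row = "055:test_2018-08-28 09:09:31"

# Order taken from the exception: valueList was written first, then hashValue.
as_written = [(row, "cf", "valueList"), (row, "cf", "hashValue")]

# The same cells sorted by the simplified key.
as_sorted = sorted(as_written, key=lambda c: cell_key(*c))

print(first_out_of_order(as_written))  # index of the offending cell
print(first_out_of_order(as_sorted))   # None when the order is valid
print([c[2] for c in as_sorted])       # qualifiers in sorted order
```

Under this model, `cf:hashValue` sorts before `cf:valueList` because `h` is smaller than `v`, so writing valueList first for the same row key would trip the check regardless of how the rows themselves are sorted.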
Of course, when I change this example to a Hive table with only one column (plus the key column), the INSERT command works, as there is no other column to read for the HFile creation.
My question: how can I create HFiles with this Hive-HBase integration when there is more than one column (plus the key) to transfer?