
Creating a Hive managed HBASE table - Insert command overwrites the data



1. Create a Hive-managed HBase table:

CREATE TABLE MyHBaseTable(MyKey string, Col1 string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,colfam:col1")
TBLPROPERTIES ("hbase.table.name" = "t2");

Here MyHBaseTable is the Hive table, and "hbase.table.name" = "t2" names the underlying HBase table t2 (a new table, created automatically).

2. INSERT INTO TABLE MyHBaseTable SELECT eid, name FROM employee_100;

Question: The above command overwrites the data already in MyHBaseTable, but I expected it to append the data. Please help if you have run into this issue.


Re: Creating a Hive managed HBASE table - Insert command overwrites the data

You have defined the rowkey in HBase to be the Hive column "MyKey". If you want new rows, make sure that you use a unique rowkey.
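To illustrate why repeated inserts do not add rows, here is a toy model in Python. It is not HBase or Hive code: a plain dict stands in for the HBase table t2, and the function insert_rows and the sample rows are hypothetical names made up for this sketch.

```python
# Toy model only: a dict keyed by rowkey stands in for the HBase table.
# insert_rows and the sample data are hypothetical, not an HBase or Hive API.

def insert_rows(table, rows):
    """Mimic INSERT INTO on an HBase-backed table: each row is a put by rowkey."""
    for rowkey, col1 in rows:
        table[rowkey] = col1  # same rowkey: the value is replaced, not appended
    return table

table = {}
insert_rows(table, [("1", "alice"), ("2", "bob")])
insert_rows(table, [("1", "alice"), ("2", "bob")])  # repeated insert, same keys
rows_after_repeat = len(table)

insert_rows(table, [("3", "carol")])  # a new, unique rowkey adds a row
rows_after_unique = len(table)

print(rows_after_repeat, rows_after_unique)
```

In this model the second insert leaves the row count unchanged, and only the insert with a previously unused rowkey grows the table.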

Re: Creating a Hive managed HBASE table - Insert command overwrites the data


@Amit Dass HBase will not store duplicate keys. If you run repeated INSERT INTO ... SELECT ... FROM statements, you will simply overwrite your data. You can, however, increase the number of versions of each record that HBase stores. To keep 5 versions of your data, run this in the HBase shell:

alter 't2', NAME => 'colfam', VERSIONS => 5

Re: Creating a Hive managed HBASE table - Insert command overwrites the data

You can, but from Hive you will still see only the latest version. To quote: "there is currently no way to access the HBase timestamp attribute, and queries always access data with the latest timestamp."
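A rough sketch of how versioning and "latest wins" reads interact, again as a toy Python model and not real HBase client code. MAX_VERSIONS mirrors the VERSIONS => 5 setting from the alter command; put, read_latest, and the sample values are hypothetical names used only for illustration.

```python
from collections import defaultdict

MAX_VERSIONS = 5  # mirrors VERSIONS => 5 from the alter command

def put(table, rowkey, value, timestamp):
    """Store a new version of a cell, keeping only the newest MAX_VERSIONS."""
    cells = table[rowkey]
    cells.append((timestamp, value))
    cells.sort(reverse=True)   # newest timestamp first
    del cells[MAX_VERSIONS:]   # older versions beyond the limit are dropped

def read_latest(table, rowkey):
    """Mimic what a Hive query sees: only the newest version of the cell."""
    return table[rowkey][0][1]

table = defaultdict(list)
for ts in range(1, 8):         # seven writes to the same rowkey
    put(table, "1", f"name_v{ts}", ts)

kept = [value for _, value in table["1"]]
latest = read_latest(table, "1")
print(kept, latest)
```

Under these assumptions, five versions are retained in the store, but a reader that always takes the newest timestamp still sees a single value per rowkey.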
