I have successfully implemented the logic described at https://community.hortonworks.com/articles/2745/creating-hbase-hfiles-from-an-existing-hive-table.ht....
However, that example covers an HBase table with a single column family. How can I proceed with an HBase table that has multiple column families using a similar approach?
Need your help.
Thanks and Regards,
This link might have more info for you: http://hortonworks.com/blog/hbase-via-hive-part-1/
Essentially, in the SERDEPROPERTIES you map Hive columns to HBase columns, including column families. The column family is the part before the colon, so cf1:c1 means column family cf1 and column c1. In the example below the family is simply named f:
WITH SERDEPROPERTIES ('hbase.columns.mapping' = ':key,f:c1,f:c2')
So you could do something like this, assuming the column names coincide:
CREATE TABLE foo(rowkey STRING, a STRING, b STRING) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ('hbase.columns.mapping' = ':key,cf1:a,cf2:b') TBLPROPERTIES ('hbase.table.name' = 'bar');
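To make the mapping concrete, here is a hedged illustration with made-up data (the row key 'r1' and the values are hypothetical; INSERT ... VALUES needs Hive 0.14+):

```sql
-- Hypothetical data: one Hive row fans out into two column families.
INSERT INTO foo VALUES ('r1', 'x', 'y');
-- In the HBase table 'bar', this should appear as a single row with
-- row key r1, column cf1:a = x, and column cf2:b = y.
```

So with the HBaseStorageHandler route, multiple column families are just a matter of listing them in hbase.columns.mapping; the single-family limitation discussed below applies only to the HFile bulk-load path.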
hope this helps
It's not possible in this scenario with completebulkload, because the HBase storage handler in Hive uses the old HFileOutputFormat, which writes HFiles for only a single column family.
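For what it's worth, one workaround sometimes sketched in the community (untested here; table, column, and path names below are hypothetical) is to generate the HFiles one column family at a time. Since hfile.family.path points at a single family directory, you can create one HFile-generating Hive table per family under a common parent directory, then run completebulkload once against the parent:

```sql
-- Hypothetical sketch: one HFile-generating table per column family.
-- hfile.family.path names a single family directory, so each table
-- writes HFiles for exactly one family under the same parent path.
CREATE TABLE hfiles_cf1 (rowkey STRING, c1 STRING)
STORED AS
  INPUTFORMAT  'org.apache.hadoop.mapred.TextInputFormat'
  OUTPUTFORMAT 'org.apache.hadoop.hive.hbase.HiveHFileOutputFormat'
TBLPROPERTIES ('hfile.family.path' = '/tmp/hfiles/cf1');

CREATE TABLE hfiles_cf2 (rowkey STRING, c2 STRING)
STORED AS
  INPUTFORMAT  'org.apache.hadoop.mapred.TextInputFormat'
  OUTPUTFORMAT 'org.apache.hadoop.hive.hbase.HiveHFileOutputFormat'
TBLPROPERTIES ('hfile.family.path' = '/tmp/hfiles/cf2');

-- Populate each table sorted by row key, as HFile writing requires.
INSERT OVERWRITE TABLE hfiles_cf1 SELECT rowkey, c1 FROM src CLUSTER BY rowkey;
INSERT OVERWRITE TABLE hfiles_cf2 SELECT rowkey, c2 FROM src CLUSTER BY rowkey;
```

A single completebulkload run against the parent directory /tmp/hfiles should then pick up both the cf1 and cf2 subdirectories, assuming your Hive and HBase versions cooperate. I have not verified this end-to-end.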
I have done the following steps but am stuck at the last one. Need your help with it.
1) Created a CSV with 5 records
2) Created an external table in Hive pointing to the HDFS directory containing CSV
3) Created a Hive-HBase table with below DDL
CREATE TABLE hbase_cdc_poc.hbase_warehouse (
  w_warehouse_sk int,
  w_warehouse_id char(16),
  w_warehouse_name varchar(20),
  w_warehouse_sq_ft int,
  w_street_number char(10),
  w_street_name varchar(60),
  w_street_type char(15),
  w_suite_number char(10),
  w_city varchar(60),
  w_county varchar(30),
  w_state char(2),
  w_zip char(10),
  w_country varchar(20),
  w_gmt_offset decimal(5,2)
)
STORED AS
  INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat'
  OUTPUTFORMAT 'org.apache.hadoop.hive.hbase.HiveHFileOutputFormat'
TBLPROPERTIES ('hfile.family.path' = '/user/tcs_ge_user/wrhs_hfiles/cf1');
4) Loaded data into the above table using the statement below and verified the count; it matched.
insert overwrite table hbase_cdc_poc.hbase_warehouse select * from hbase_cdc_poc.warehouse cluster by w_warehouse_sk;
5) The HFile was also created at the path /user/tcs_ge_user/wrhs_hfiles/cf1. Screenshot attached.
6) Now I am using completebulkload to load the data into HBase. The table warehouse has already been created in HBase.
yarn jar /usr/hdp/current/hbase-client/lib/hbase-server.jar completebulkload /user/tcs_ge_user/wrhs_hfiles/cf1 warehouse
But executing this command gives me the error below. Screenshot attached. Need your urgent help with this.
@rajdip chaudhuri You should not include the column family in the path for completebulkload: the tool expects the parent output directory and treats each subdirectory under it (here cf1) as a column family. So, try
yarn jar /usr/hdp/current/hbase-client/lib/hbase-server.jar completebulkload /user/tcs_ge_user/wrhs_hfiles warehouse