Reply
New Contributor
Posts: 2
Registered: ‎03-24-2019

Hive HBase snapshot query result can't insert into new table

Hi, we are using CDH 5.16.1 ,hbase-1.2.0,hive 1.1.0  .

I'm trying to load hbase data into hive, but i got an issue when i use hbase snapshot, it only load the row key into hive table. But if i donn't use the snapshot, HbaseStorageHandler can successfully load data into new hive table.

Here are steps i took:

 

first i have created a hbase table:

 

create 'user_data_table','personal_data','professional_data';

 

Then I inserted few records into the table as

put 'user_data_table','user1','personal_data:Location','IL'
put 'user_data_table','user1','personal_data:FName','Deb'
put 'user_data_table','user1','personal_data:LName','D'
put 'user_data_table','user1','professional_data:dept','IT'
put 'user_data_table','user1','professional_data:salary','2000'

put 'user_data_table','user2','personal_data:FName','CH'
put 'user_data_table','user2','personal_data:LName','AK'
put 'user_data_table','user2','professional_data:dept','IT'
put 'user_data_table','user2','professional_data:salary','80000'

 

Then i specify the  snapshot.name to make Hive query Hbase bypasses HBase servers, and directly accesses the underlying files.

set hive.hbase.snapshot.name=snapshot-day-1;
set hive.hbase.snapshot.restoredir=/tmp;

CREATE EXTERNAL TABLE if not exists default.hbase_user_data_snapshot1_table1(key string, Location string,FName string,LName string, dept string,salary string) 
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,personal_data:Location,personal_data:FName,personal_data:LName,professional_data:dept,professional_data:salary")
TBLPROPERTIES ("hbase.table.name" = "user_data_table");

 

 

when i try to load the query result into hive table as below, the result are  weirdly,all the value of fields are NULL ,except the row key.

 

 

#show hbase table data
hive> select * from default.hbase_user_data_snapshot1_table;
OK
user1	IL	Deb	D	IT	2000
user2	NULL	CH	AK	IT	80000
Time taken: 0.199 seconds, Fetched: 2 row(s)

#load the query result into new table
hive> create table test as select * from default.hbase_user_data_snapshot1_table;

.... OK

#but all the value of  fields are NULL ,expect the row key.
hive> select * from test;
OK
user1	NULL		NULL	NULL	NULL
user2	NULL		NULL	NULL	NULL

 

 

well i also try to load data without the param " set hive.hbase.snapshot.name ", now it insert query result into table as i expected. (but in this way ,it was very slow, because HBaseStorageHandler query the hbase with full scan with hbase server api )

CREATE EXTERNAL TABLE if not exists default.hbase_user_data_snapshot1_table1(key string, Location string,FName string,LName string, dept string,salary string) 
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,personal_data:Location,personal_data:FName,personal_data:LName,professional_data:dept,professional_data:salary")
TBLPROPERTIES ("hbase.table.name" = "user_data_table");

 create table test1 as select * from default.hbase_user_data_snapshot1_table;

select * from test1;
OK
user1	IL	Deb	D	IT	2000
user2	NULL	CH	AK	IT	80000

the snapshot feature link:

https://issues.apache.org/jira/browse/HBASE-8369

 

Is there anybody can help me? 

 

New Contributor
Posts: 2
Registered: ‎03-24-2019

Re: Hive HBase snapshot query result can't insert into new table

i also tried to specify the hive.hbase.snapshot.name in TBLPROPERTIES,but the hive query request the Hbase region server API instead of access the underline file directly. I looked the HbaseStorageHandler source code, it choose whether or not use the HbaseSnapshot by judging the hive conf property (HiveConf.ConfVars.HIVE_HBASE_SNAPSHOT_NAME)

So if you specify the hive.hbase.snapshot.name in any other place ,it will not available for hive to query hbase with snapshot. And i also changed all the hdfs dir permission to 777,but it still couldn't insert data into hive table,except i didn't use the snapshot. It was weirdly I checked the DEBUG log ,but there is not any error in it.

I'm wandering is this a bug about that feature?