Hive to HBase Data Migration Missing Data
- Labels: Apache HBase
Created ‎12-31-2020 11:39 AM
I have loaded data from a Hive table into HBase.
The Hive source table has 3000 rows, but after loading, the HBase table has only 1200 records.
I don't understand why. Can anyone please explain?
CREATE TABLE events_Hbase(
src_util_id int,
event_log_id bigint,
event_id int,
event_text string,
partition_date date,
load_date date)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf:event_log_id,cf:event_id,cf:event_text,cf:partition_date,cf:load_date")
TBLPROPERTIES ("hbase.table.name" = "HbaseEvents");
INSERT INTO TABLE events_Hbase SELECT * FROM Meterevents;
SELECT count(*) FROM Meterevents;   -- 3000 (Hive source table)
SELECT count(*) FROM events_Hbase;  -- 1200 (HBase table)
Can someone please explain?
Created ‎01-10-2021 10:19 PM
Hello @Madhureddy
Thanks for using Cloudera Community. As I understand the post, the Hive table "Meterevents" holds 3000 records, and after an INSERT ... SELECT from "Meterevents" into "events_Hbase", the "events_Hbase" table shows only 1200 records.
Please check the following:
1. Connect to the HBase shell and confirm the row count of the "HbaseEvents" table (for example, with count 'HbaseEvents').
2. If the count of the "HbaseEvents" table is 1200, check whether the first column, which is mapped to ":key", is unique in the source data. In your table definition that column is src_util_id. HBase keeps a single row per RowKey, so when a RowKey value repeats, the later insert is stored as a new version of the same row rather than as a new row, which reduces the row count.
3. Your team can verify this with a small test: create two source tables, one holding 10 rows with 10 unique RowKey values and the other holding 10 rows with only 5 unique RowKey values. Next, create two Hive tables using HBaseStorageHandler and run the INSERT ... SELECT into each. Then compare the row counts; the first should show 10 rows and the second 5.
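The uniqueness check in step 2 can also be run directly against the Hive source table. The queries below are a minimal sketch. They assume that src_util_id is the first column of "Meterevents" (its schema is not shown in the post) and is therefore the column that SELECT * maps to ":key". The aliases total_rows, distinct_keys and occurrences are illustrative names only.

```sql
-- Compare the total row count with the number of distinct RowKey candidates.
-- Assumption: src_util_id is the first column of Meterevents and is what
-- "SELECT *" writes into the :key column of events_Hbase.
SELECT COUNT(*)                    AS total_rows,
       COUNT(DISTINCT src_util_id) AS distinct_keys
FROM Meterevents;

-- List the key values that occur more than once in the source.
SELECT src_util_id,
       COUNT(*) AS occurrences
FROM Meterevents
GROUP BY src_util_id
HAVING COUNT(*) > 1
ORDER BY occurrences DESC
LIMIT 20;
```

If distinct_keys comes back as 1200 while total_rows is 3000, repeated RowKey values would account for the difference. In that case, one option is to load with a RowKey that is unique per source row, for example a composite of src_util_id and event_log_id, assuming that pair is unique in your data.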
- Smarak
