Created 05-23-2016 08:25 PM
I am trying to copy column data from one Hive-on-HBase table to a second Hive-on-HBase table.
I am getting an "HBase row key cannot be NULL" error even though the row key is present.
Steps to reproduce:
HBase DDL:
create 'TRIAL_SRC', {NAME => 'd'}
create 'TRIAL_DEST', {NAME => 'd'}
Hive DDL:
Table 1 (Hive over Hbase):
create external table if not exists DCHANDRA.TRIAL_SRC (
  key string
, pat_id string
)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ('hbase.columns.mapping' = ':key,d:ref_val')
tblproperties ('hbase.table.name' = 'TRIAL_SRC');
Table 2 (Hive over Hbase):
create external table if not exists DCHANDRA.TRIAL_DEST (
  key string
, pat_id string
)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ('hbase.columns.mapping' = ':key,d:ref_val')
tblproperties ('hbase.table.name' = 'TRIAL_DEST');
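For reference, the leading :key entry in hbase.columns.mapping binds the first declared Hive column (key) to the HBase row key, which is why a NULL value for it is rejected. One way to confirm the mapping after creating a table (a sketch using the table name above):
describe formatted DCHANDRA.TRIAL_DEST;
The serde parameters in the output should list hbase.columns.mapping = :key,d:ref_val.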
DML for Table1:
insert into DCHANDRA.TRIAL_SRC (key,pat_id) values (1,101);
insert into DCHANDRA.TRIAL_SRC (key,pat_id) values (2,102);
insert into DCHANDRA.TRIAL_SRC (key,pat_id) values (3,103);
insert into DCHANDRA.TRIAL_SRC (key,pat_id) values (4,104);
insert into DCHANDRA.TRIAL_SRC (key,pat_id) values (5,105);
DML for Table2:
insert into DCHANDRA.TRIAL_DEST (key,pat_id) values (1,10);
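At this point TRIAL_SRC holds five rows and TRIAL_DEST holds only key 1, so the join below can match only that row. A quick way to confirm the state of both tables before the upsert (a sketch reusing the tables above):
select key, pat_id from dchandra.trial_src;
select key, pat_id from dchandra.trial_dest;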
Upsert into Table 2 (TRIAL_DEST) from Table 1 (TRIAL_SRC):
insert into dchandra.trial_dest(pat_id) select src.pat_id from dchandra.trial_src src join dchandra.trial_dest dest on src.key=dest.key;
This fails with the following error:
Status: Failed
Vertex failed, vertexName=Map 1, vertexId=vertex_1463698008506_2409_9_01, diagnostics=[Task failed, taskId=task_1463698008506_2409_9_01_000000, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"key":"1","pat_id":"101"}
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:344)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"key":"1","pat_id":"101"}
at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:91)
at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:310)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:148)
... 14 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"key":"1","pat_id":"101"}
at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:545)
at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:83)
... 17 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unexpected exception: org.apache.hadoop.hive.serde2.SerDeException: java.io.IOException: HBase row key cannot be NULL
at org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:426)
-Datta
Created 05-23-2016 08:38 PM
Your insert statement did not include the key. When you ran:
insert into dchandra.trial_dest(pat_id) select src.pat_id from dchandra.trial_src src join dchandra.trial_dest dest on src.key=dest.key;
Hive fills any column not listed in the insert (here, key) with NULL, and the HBase storage handler rejects a NULL row key. It should be:
insert into dchandra.trial_dest(key, pat_id) select src.key, src.pat_id from dchandra.trial_src src join dchandra.trial_dest dest on src.key=dest.key;
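Once the key column is included, a quick sanity check against the destination table (a sketch using the table names from this thread):
select key, pat_id from dchandra.trial_dest;
Given the sample data, key 1 is the only value present in both tables, so the corrected statement updates just that row (an HBase put on an existing row key overwrites the mapped cell).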
Created 05-23-2016 08:44 PM
@ashu Thanks. Resolved it.