"HBase row key cannot be NULL" error when copying between Hive-HBase tables

I am trying to copy one column's data from one Hive-over-HBase table to a second Hive-over-HBase table.

I get the "HBase row key cannot be NULL" error even though the row key exists in both tables.

Steps to reproduce:

HBase DDL:

create 'TRIAL_SRC', {NAME => 'd'}

create 'TRIAL_DEST', {NAME => 'd'}

Hive DDL:

Table 1 (Hive over Hbase):

create external table if not exists DCHANDRA.TRIAL_SRC (
  key string,
  pat_id string
)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ('hbase.columns.mapping' = ':key,d:ref_val')
TBLPROPERTIES ('hbase.table.name' = 'TRIAL_SRC');

Table 2 (Hive over Hbase):

create external table if not exists DCHANDRA.TRIAL_DEST (
  key string,
  pat_id string
)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ('hbase.columns.mapping' = ':key,d:ref_val')
TBLPROPERTIES ('hbase.table.name' = 'TRIAL_DEST');

DML for Table1:

insert into DCHANDRA.TRIAL_SRC (key,pat_id) values (1,101);
insert into DCHANDRA.TRIAL_SRC (key,pat_id) values (2,102);
insert into DCHANDRA.TRIAL_SRC (key,pat_id) values (3,103);
insert into DCHANDRA.TRIAL_SRC (key,pat_id) values (4,104);
insert into DCHANDRA.TRIAL_SRC (key,pat_id) values (5,105);

DML for Table2:

insert into DCHANDRA.TRIAL_DEST (key,pat_id) values (1,10);

Upsert into Table 2 (TRIAL_DEST) from Table 1 (TRIAL_SRC):

insert into dchandra.trial_dest(pat_id) select src.pat_id from dchandra.trial_src src join dchandra.trial_dest dest on src.key=dest.key;

The statement fails with this error:

Status: Failed
Vertex failed, vertexName=Map 1, vertexId=vertex_1463698008506_2409_9_01, diagnostics=[Task failed, taskId=task_1463698008506_2409_9_01_000000, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"key":"1","pat_id":"101"}
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:344)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
    at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"key":"1","pat_id":"101"}
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:91)
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:310)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:148)
    ... 14 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"key":"1","pat_id":"101"}
    at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:545)
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:83)
    ... 17 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unexpected exception: org.apache.hadoop.hive.serde2.SerDeException: java.io.IOException: HBase row key cannot be NULL
    at org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:426)

-Datta

1 ACCEPTED SOLUTION

Your insert statement did not include the key. When you ran:

insert into dchandra.trial_dest(pat_id) select src.pat_id from dchandra.trial_src src join dchandra.trial_dest dest on src.key=dest.key;

only pat_id was supplied. Hive writes NULL for any column left out of the insert column list, so each row reached HBase without a row key. The statement should be:

insert into dchandra.trial_dest(key, pat_id) select src.key, src.pat_id from dchandra.trial_src src join dchandra.trial_dest dest on src.key=dest.key;
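
To check that the corrected statement behaves as an upsert, you could query the destination table afterwards. This is only a sketch that reuses the table and column names from the question; it has not been run against the original cluster.

```sql
-- Corrected upsert: key is selected alongside pat_id,
-- so every row written to HBase carries a row key.
insert into dchandra.trial_dest (key, pat_id)
select src.key, src.pat_id
from dchandra.trial_src src
join dchandra.trial_dest dest on src.key = dest.key;

-- Inspect the destination table after the insert.
select key, pat_id from dchandra.trial_dest;
```

With the sample rows from the question, only key 1 exists in both tables, so the join should return a single row. Because writing to an existing HBase row key overwrites the stored cell, the row for key 1 would be expected to show pat_id 101 in place of the original 10.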


@ashu Thanks. Resolved it.