Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant.
Announcements
This board is archived and read-only for historical reference. To ask a new question, please post a new topic on the appropriate active board.

HBase row key cannot be NULL error when copying between Hive-HBase tables


I am trying to copy column data from one Hive-HBase table to a second Hive-HBase table.

I get the "HBase row key cannot be NULL" error even though a row key is present.

Steps to reproduce:

HBase DDL:

create 'TRIAL_SRC', {NAME => 'd'}

create 'TRIAL_DEST', {NAME => 'd'}

Hive DDL:

Table 1 (Hive over Hbase):

create external table if not exists DCHANDRA.TRIAL_SRC (
  key string,
  pat_id string
) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ('hbase.columns.mapping' = ':key,d:ref_val')
tblproperties ('hbase.table.name' = 'TRIAL_SRC');

Table 2 (Hive over Hbase):

create external table if not exists DCHANDRA.TRIAL_DEST (
  key string,
  pat_id string
) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ('hbase.columns.mapping' = ':key,d:ref_val')
tblproperties ('hbase.table.name' = 'TRIAL_DEST');

DML for Table1:

insert into DCHANDRA.TRIAL_SRC (key,pat_id) values (1,101);

insert into DCHANDRA.TRIAL_SRC (key,pat_id) values (2,102);

insert into DCHANDRA.TRIAL_SRC (key,pat_id) values (3,103);

insert into DCHANDRA.TRIAL_SRC (key,pat_id) values (4,104);

insert into DCHANDRA.TRIAL_SRC (key,pat_id) values (5,105);

DML for Table2:

insert into DCHANDRA.TRIAL_DEST (key,pat_id) values (1,10);

Upsert into Table 2 (TRIAL_DEST) from Table 1 (TRIAL_SRC):

insert into dchandra.trial_dest(pat_id) select src.pat_id from dchandra.trial_src src join dchandra.trial_dest dest on src.key=dest.key;

The query fails with this error:

Status: Failed

Vertex failed, vertexName=Map 1, vertexId=vertex_1463698008506_2409_9_01, diagnostics=[Task failed, taskId=task_1463698008506_2409_9_01_000000, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"key":"1","pat_id":"101"}

at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)

at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)

at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:344)

at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)

at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)

at java.security.AccessController.doPrivileged(Native Method)

at javax.security.auth.Subject.doAs(Subject.java:415)

at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)

at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)

at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)

at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)

at java.util.concurrent.FutureTask.run(FutureTask.java:262)

at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)

at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

at java.lang.Thread.run(Thread.java:745)

Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"key":"1","pat_id":"101"}

at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:91)

at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)

at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:310)

at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:148)

... 14 more

Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"key":"1","pat_id":"101"}

at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:545)

at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:83)

... 17 more

Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unexpected exception: org.apache.hadoop.hive.serde2.SerDeException: java.io.IOException: HBase row key cannot be NULL

at org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:426)

-Datta

1 ACCEPTED SOLUTION


Your insert statement did not include the key column, so Hive writes NULL into the column mapped to the HBase row key. It should be:

insert into dchandra.trial_dest(key, pat_id) select src.key, src.pat_id from dchandra.trial_src src join dchandra.trial_dest dest on src.key=dest.key;
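The root cause generalizes beyond Hive: in standard SQL, any column omitted from an INSERT column list is filled with NULL. A minimal sketch of that behavior, using Python's sqlite3 purely as a stand-in for the Hive tables in this thread (the table and data mirror the question; it is the HBase storage handler that then rejects the NULL key):

```python
import sqlite3

# In-memory SQLite tables standing in for the Hive-over-HBase tables above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trial_src (key TEXT, pat_id TEXT)")
conn.execute("CREATE TABLE trial_dest (key TEXT, pat_id TEXT)")
conn.execute("INSERT INTO trial_src (key, pat_id) VALUES ('1', '101')")
conn.execute("INSERT INTO trial_dest (key, pat_id) VALUES ('1', '10')")

# Mirrors the failing statement: `key` is omitted from the column list,
# so the engine fills it with NULL for every inserted row.
conn.execute(
    "INSERT INTO trial_dest (pat_id) "
    "SELECT src.pat_id FROM trial_src src "
    "JOIN trial_dest dest ON src.key = dest.key"
)

rows = conn.execute(
    "SELECT key, pat_id FROM trial_dest ORDER BY pat_id"
).fetchall()
print(rows)  # [('1', '10'), (None, '101')] -- the NULL key is what HBase rejects
```

Carrying src.key through the select list, as in the corrected statement, keeps the row-key column non-NULL so the HBase write succeeds.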


2 REPLIES



@ashu Thanks. Resolved it.