Support Questions
Find answers, ask questions, and share your expertise

Insert into HBase via Hive - Error for medium sized data

New Contributor

Hello guys,

I created a managed Hive table that stores its data in HBase.

The Hive table is created like this (simplified):

CREATE TABLE IF NOT EXISTS hivetable (key string, value string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ('hbase.columns.mapping' = ':key,FAMILIY:VALUE')
TBLPROPERTIES ('hbase.table.name' = 'hbasetable',
               'hbase.mapred.output.outputtable' = 'hbasetable');

As you can see 'hbase.mapred.output.outputtable' is set.
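(To double-check that the property really made it into the metastore, this can be verified directly; `hivetable` is the table name from my DDL above:)

```sql
-- List the table properties stored in the metastore;
-- hbase.mapred.output.outputtable should appear here.
SHOW TBLPROPERTIES hivetable;
```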

Now I want to insert data from another Hive table into this table:

INSERT INTO TABLE hivetable
SELECT key, value FROM anotherhivetable;

This INSERT works fine for small data sizes (500 MB works).

But for a somewhat larger data size (e.g. 2 GB) it fails with this error:

java.sql.SQLException: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Map 1, vertexId=vertex_1507297731037_3840_1_00, diagnostics=[Task failed, taskId=task_1507297731037_3840_1_00_000018, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: Map operator initialization failed
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:194)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:185)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:185)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:181)
    at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: Map operator initialization failed
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:262)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:149)
    ... 14 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.IllegalArgumentException: Must specify table name
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createHiveOutputFormat(FileSinkOperator.java:1150)
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.initializeOp(FileSinkOperator.java:350)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:363)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:482)
    at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:439)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:376)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:482)
    at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:439)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:376)
    at org.apache.hadoop.hive.ql.exec.MapOperator.initializeMapOperator(MapOperator.java:489)
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:231)
    ... 15 more
Caused by: java.lang.IllegalArgumentException: Must specify table name
    at org.apache.hadoop.hbase.mapreduce.TableOutputFormat.setConf(TableOutputFormat.java:195)
    at org.apache.hive.common.util.ReflectionUtil.setConf(ReflectionUtil.java:101)
    at org.apache.hive.common.util.ReflectionUtil.newInstance(ReflectionUtil.java:87)
    at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:277)
    at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:267)
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createHiveOutputFormat(FileSinkOperator.java:1148)
    ... 25 more

I think the interesting part is this:

Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.IllegalArgumentException: Must specify table name at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createHiveOutputFormat

As said before, 'hbase.mapred.output.outputtable' is set in the TBLPROPERTIES.
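(One thing that could be tried, though I have not confirmed it helps with this bug: setting the property at session level as well, in case it is not being propagated from the TBLPROPERTIES. This is only a sketch using the table names from my DDL above:)

```sql
-- Hypothetical workaround: also set the output table for the session,
-- in addition to the TBLPROPERTIES entry.
SET hbase.mapred.output.outputtable=hbasetable;

INSERT INTO TABLE hivetable
SELECT key, value FROM anotherhivetable;
```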

Via Google I found this bug in the Hive JIRA:

https://issues.apache.org/jira/browse/HIVE-15103,

which seems to be the same issue.

Furthermore, Vsevolod Ostapenko commented on this bug:

The same issue can be observed on Hive 1.2 with Tez in HDP 2.5. Property hbase.mapred.output.outputtable is set to the name of the destination HBase table.
Hive with MR execution engine works as expected.
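(For reference, switching the execution engine for a session looks like this; whether it actually avoids the bug is only what the JIRA comment above claims:)

```sql
-- Switch this session from Tez to MapReduce, as the JIRA comment
-- reports the problem does not occur with the MR engine.
SET hive.execution.engine=mr;

INSERT INTO TABLE hivetable
SELECT key, value FROM anotherhivetable;
```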

Hive 1.2 with Tez in HDP 2.5 is exactly the same setup as on my cluster.

Switching to MR, however, does not seem to solve my problem.

The people running my platform assume the cause is insufficient resources (memory) and that this is what makes the job crash, but the error message does not look resource-related, and 2 GB of data to insert does not seem like much.

Does anybody have a solution for this problem, or an idea why it fails?

1 REPLY

Re: Insert into HBase via Hive - Error for medium sized data

New Contributor

Hi guys, I have the same issue. Have you found any solution for this?
