Support Questions

Hive Table Inserts are failing after certain number of inserts

Explorer

I have some Hive inserts that work absolutely fine on HDP 2.6.5 with Red Hat 6.8, but the same inserts fail on HDP 2.6.5 with Red Hat 7.5/7.6.

To reproduce it, create a table (preferably an ACID table); after inserting 100+ rows (sometimes 129 rows, sometimes 130 rows), the inserts start failing with a "could only be replicated to 0 nodes" error.

Surprisingly, if I start the same inserts again, they fail again after roughly the same 100-odd inserts. I tried tuning Ambari properties for Hive/Tez/YARN following the HDP guidelines, but that did not help much. The DataNode replication error could be misleading: when I tried inserting into non-transactional tables, I got other errors as well. All of those backtraces are pasted below:
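For anyone trying to reproduce this, a minimal sketch of the driver script I have in mind is below. The table and column names are made up for illustration (they are not from my actual schema); the point is simply to generate 130+ single-row INSERT statements and run them back-to-back in one session, e.g. by feeding them to beeline or executing them over one JDBC connection:

```python
# Illustrative repro driver: generate enough single-row INSERTs to pass
# the ~129/130-row failure point when run in a single Hive session.
# Table/column names are hypothetical, e.g. a table created with:
#   CREATE TABLE analyticsdb.test_acid (id INT, payload STRING)
#     CLUSTERED BY (id) INTO 2 BUCKETS STORED AS ORC
#     TBLPROPERTIES ('transactional'='true');
TABLE = "analyticsdb.test_acid"

def make_inserts(n):
    """Build n single-row INSERT statements for the assumed ACID table."""
    return [f"INSERT INTO {TABLE} VALUES ({i}, 'row-{i}')" for i in range(n)]

# 135 statements: comfortably past the point where the failures start.
statements = make_inserts(135)
```

Each statement would then be executed in order within the same beeline/JDBC session; on my setup the failures begin around statement 129-130.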

1) Backtrace 1: could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and no node(s) are excluded in this operation.

INFO : Dag name: INSERT INTO analyticsdb....15:17:50.417',46)(Stage-1)
ERROR : Failed to execute tez graph.
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /tmp/hive/hive/_tez_session_dir/e891a1be-8544-4bb4-a9f6-4a83a18094e2/.tez/application_1562919643063_0016/tez-conf.pb could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and no node(s) are excluded in this operation.
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1719)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getNewBlockTargets(FSNamesystem.java:3372)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3296)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:850)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:504)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)


2) Hive Runtime Error while closing operators

INFO : Dag name: INSERT INTO analyticsdb....15:17:50.417',46)(Stage-1)
INFO : Status: Running (Executing on YARN cluster with App id application_1562919643063_0011)

--------------------------------------------------------------------------------
        VERTICES      STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
--------------------------------------------------------------------------------
Map 1            RUNNING      1          0        0        1       4       0
--------------------------------------------------------------------------------
VERTICES: 00/01  [>>--------------------------] 0%    ELAPSED TIME: 11.19 s
--------------------------------------------------------------------------------
ERROR : Status: Failed
ERROR : Vertex failed, vertexName=Map 1, vertexId=vertex_1562919643063_0011_200_00, diagnostics=[Task failed, taskId=task_1562919643063_0011_200_00_000000, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: Hive Runtime Error while closing operators
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:194)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:185)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:185)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:181)
    at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: Hive Runtime Error while closing operators
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.close(MapRecordProcessor.java:370)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:164)
    ... 14 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /apps/hive/warehouse/analyticsdb.db/tbl_accessareareaders_cdc/.hive-staging_hive_2019-07-12_09-57-54_555_2284217577217095088-3/_task_tmp.-ext-10000/_tmp.000000_0 could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and no node(s) are excluded in this operation.
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1719)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getNewBlockTargets(FSNamesystem.java:3372)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3296)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:850)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:504)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)


3) Failed with exception Buffer underflow.

INFO : Dag name: INSERT INTO analyticsdb....15:17:50.417',46)(Stage-1)
INFO : Loading data to table analyticsdb.tbl_accessareareaders_cdc from hdfs://ip-172-31-17-232.ec2.internal:8020/apps/hive/warehouse/analyticsdb.db/tbl_accessareareaders_cdc/.hive-staging_hive_2019-07-12_09-27-38_852_1168222145805851570-3/-ext-10000
INFO : [Warning] could not update stats.Failed with exception Buffer underflow.
org.apache.hive.com.esotericsoftware.kryo.KryoException: Buffer underflow.
    at org.apache.hive.com.esotericsoftware.kryo.io.Input.require(Input.java:181)
    at org.apache.hive.com.esotericsoftware.kryo.io.Input.readVarInt(Input.java:355)
    at org.apache.hive.com.esotericsoftware.kryo.Kryo.readReferenceOrNull(Kryo.java:809)
    at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:670)
    at org.apache.hadoop.hive.ql.stats.fs.FSStatsAggregator.connect(FSStatsAggregator.java:66)
    at org.apache.hadoop.hive.ql.exec.StatsTask.createStatsAggregator(StatsTask.java:318)
    at org.apache.hadoop.hive.ql.exec.StatsTask.aggregateStats(StatsTask.java:149)
    at org.apache.hadoop.hive.ql.exec.StatsTask.execute(StatsTask.java:122)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:162)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89)
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1765)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1506)
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1303)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1170)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1165)
    at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:197)
    at org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:76)
    at org.apache.hive.service.cli.operation.SQLOperation$2$1.run(SQLOperation.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
    at org.apache.hive.service.cli.operation.SQLOperation$2.run(SQLOperation.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)


2 Replies

Explorer

I see a lot of

"failed to rename /tmp/test/stage_hive_2019-07-15_11-24-45_060_5725383949242199490-280/-ext-10000/000000_0 to /apps/hive/warehouse/analyticsdb.db/test_p/000000_0_copy_135 because destination exists"

errors in the NameNode log.

It looks like https://issues.apache.org/jira/browse/HIVE-13850, but I am already using ACID tables, so I am not sure why the same file name keeps getting generated in a loop.


Explorer

I found a workaround for this problem: if I have 500 rows, I have to execute them across 5 beeline/JDBC/Hive CLI sessions, so the first session inserts about 120 rows, then a new session/new JDBC connection takes over, and so on. I am not sure why this happens in the first place, i.e. why the existing Hive session cannot see the HDFS file/directory names generated by its own earlier inserts and ends up generating exactly the same file name as a previous insert.
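The workaround above can be sketched roughly as follows. This is a minimal illustration, not my exact script: `connect` is assumed to be any DB-API-style factory that opens a fresh Hive session (for example PyHive's `hive.connect`), and the batch size of 120 comes from the observed failure point:

```python
def chunk(statements, batch_size=120):
    """Split a list of INSERT statements into batches small enough
    that each batch can run in its own Hive session."""
    return [statements[i:i + batch_size]
            for i in range(0, len(statements), batch_size)]

def run_in_fresh_sessions(statements, connect, batch_size=120):
    """Open a brand-new connection per batch, mirroring the workaround:
    `connect` is an assumed factory returning a DB-API connection."""
    for batch in chunk(statements, batch_size):
        conn = connect()              # fresh session, like a new beeline/JDBC login
        try:
            cur = conn.cursor()
            for stmt in batch:
                cur.execute(stmt)
        finally:
            conn.close()              # tear the session down before the next batch

# 500 single-row inserts end up spread across 5 sessions of <=120 each.
stmts = [f"INSERT INTO analyticsdb.test_p VALUES ({i})" for i in range(500)]
batches = chunk(stmts)
```

The batching itself is trivial; the part that matters is that every batch gets a completely new session, which is what avoids the duplicate staging-file names for me.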