Support Questions

Hive Table Inserts are failing after certain number of inserts

Explorer

I have some Hive inserts that work absolutely fine on HDP 2.6.5 with Red Hat 6.8, but the same inserts fail on HDP 2.6.5 with Red Hat 7.5/7.6.

To reproduce it, create a table (preferably an ACID table); after inserting 100+ rows (sometimes 129 rows, sometimes 130 rows), the inserts start failing with a "could only be replicated to 0 nodes" error.

Surprisingly, if I start the same inserts again, they fail again after roughly the same 100-odd inserts. I tried tuning Ambari properties for Hive/Tez/YARN following the HDP guidelines, but that did not help much. The DataNode replication error could be misleading: when I tried inserting into non-transactional tables, I got other errors as well. All of those backtraces are pasted below:
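For anyone trying to reproduce this, a minimal sketch of the driver script I have in mind is below. The table and column names are made up for illustration (they are not from my actual schema); the point is simply to generate 130+ single-row INSERT statements and run them back-to-back in one session, e.g. by feeding them to beeline or executing them over one JDBC connection:

```python
# Illustrative repro driver: generate enough single-row INSERTs to pass
# the ~129/130-row failure point when run in a single Hive session.
# Table/column names are hypothetical, e.g. a table created with:
#   CREATE TABLE analyticsdb.test_acid (id INT, payload STRING)
#     CLUSTERED BY (id) INTO 2 BUCKETS STORED AS ORC
#     TBLPROPERTIES ('transactional'='true');
TABLE = "analyticsdb.test_acid"

def make_inserts(n):
    """Build n single-row INSERT statements for the assumed ACID table."""
    return [f"INSERT INTO {TABLE} VALUES ({i}, 'row-{i}')" for i in range(n)]

# 135 statements: comfortably past the point where the failures start.
statements = make_inserts(135)
```

Each statement would then be executed in order within the same beeline/JDBC session; on my setup the failures begin around statement 129-130.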

1) Backtrace 1: could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and no node(s) are excluded in this operation.

INFO : Dag name: INSERT INTO analyticsdb....15:17:50.417',46)(Stage-1)
ERROR : Failed to execute tez graph.
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /tmp/hive/hive/_tez_session_dir/e891a1be-8544-4bb4-a9f6-4a83a18094e2/.tez/application_1562919643063_0016/tez-conf.pb could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and no node(s) are excluded in this operation.
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1719)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getNewBlockTargets(FSNamesystem.java:3372)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3296)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:850)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:504)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)


2) Hive Runtime Error while closing operators

INFO : Dag name: INSERT INTO analyticsdb....15:17:50.417',46)(Stage-1)
INFO : Status: Running (Executing on YARN cluster with App id application_1562919643063_0011)

--------------------------------------------------------------------------------
        VERTICES      STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
--------------------------------------------------------------------------------
Map 1            RUNNING      1          0        0        1       4       0
--------------------------------------------------------------------------------
VERTICES: 00/01  [>>--------------------------] 0%    ELAPSED TIME: 11.19 s
--------------------------------------------------------------------------------
ERROR : Status: Failed
ERROR : Vertex failed, vertexName=Map 1, vertexId=vertex_1562919643063_0011_200_00, diagnostics=[Task failed, taskId=task_1562919643063_0011_200_00_000000, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: Hive Runtime Error while closing operators
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:194)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:185)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:185)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:181)
    at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: Hive Runtime Error while closing operators
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.close(MapRecordProcessor.java:370)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:164)
    ... 14 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /apps/hive/warehouse/analyticsdb.db/tbl_accessareareaders_cdc/.hive-staging_hive_2019-07-12_09-57-54_555_2284217577217095088-3/_task_tmp.-ext-10000/_tmp.000000_0 could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and no node(s) are excluded in this operation.
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1719)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getNewBlockTargets(FSNamesystem.java:3372)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3296)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:850)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:504)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)


3) Failed with exception Buffer underflow.

INFO : Dag name: INSERT INTO analyticsdb....15:17:50.417',46)(Stage-1)
INFO : Loading data to table analyticsdb.tbl_accessareareaders_cdc from hdfs://ip-172-31-17-232.ec2.internal:8020/apps/hive/warehouse/analyticsdb.db/tbl_accessareareaders_cdc/.hive-staging_hive_2019-07-12_09-27-38_852_1168222145805851570-3/-ext-10000
INFO : [Warning] could not update stats.Failed with exception Buffer underflow.
org.apache.hive.com.esotericsoftware.kryo.KryoException: Buffer underflow.
    at org.apache.hive.com.esotericsoftware.kryo.io.Input.require(Input.java:181)
    at org.apache.hive.com.esotericsoftware.kryo.io.Input.readVarInt(Input.java:355)
    at org.apache.hive.com.esotericsoftware.kryo.Kryo.readReferenceOrNull(Kryo.java:809)
    at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:670)
    at org.apache.hadoop.hive.ql.stats.fs.FSStatsAggregator.connect(FSStatsAggregator.java:66)
    at org.apache.hadoop.hive.ql.exec.StatsTask.createStatsAggregator(StatsTask.java:318)
    at org.apache.hadoop.hive.ql.exec.StatsTask.aggregateStats(StatsTask.java:149)
    at org.apache.hadoop.hive.ql.exec.StatsTask.execute(StatsTask.java:122)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:162)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89)
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1765)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1506)
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1303)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1170)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1165)
    at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:197)
    at org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:76)
    at org.apache.hive.service.cli.operation.SQLOperation$2$1.run(SQLOperation.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
    at org.apache.hive.service.cli.operation.SQLOperation$2.run(SQLOperation.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)


2 Replies

Explorer

I see a lot of

"failed to rename /tmp/test/stage_hive_2019-07-15_11-24-45_060_5725383949242199490-280/-ext-10000/000000_0 to /apps/hive/warehouse/analyticsdb.db/test_p/000000_0_copy_135 because destination exists"

errors in the NameNode log.

It looks like https://issues.apache.org/jira/browse/HIVE-13850, but I am already using ACID tables, so I am not sure why the same file name keeps getting generated in a loop.


Explorer

I found a workaround for this problem: if I have 500 rows, I have to execute them across 5 beeline/JDBC/Hive CLI sessions, so the first session inserts about 120 rows, then a new session/new JDBC connection takes over, and so on. I am not sure why this happens in the first place, i.e. why the existing Hive session cannot see the HDFS file/directory names generated by its own earlier inserts and ends up generating exactly the same file name as a previous insert.
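The workaround above can be sketched roughly as follows. This is a minimal illustration, not my exact script: `connect` is assumed to be any DB-API-style factory that opens a fresh Hive session (for example PyHive's `hive.connect`), and the batch size of 120 comes from the observed failure point:

```python
def chunk(statements, batch_size=120):
    """Split a list of INSERT statements into batches small enough
    that each batch can run in its own Hive session."""
    return [statements[i:i + batch_size]
            for i in range(0, len(statements), batch_size)]

def run_in_fresh_sessions(statements, connect, batch_size=120):
    """Open a brand-new connection per batch, mirroring the workaround:
    `connect` is an assumed factory returning a DB-API connection."""
    for batch in chunk(statements, batch_size):
        conn = connect()              # fresh session, like a new beeline/JDBC login
        try:
            cur = conn.cursor()
            for stmt in batch:
                cur.execute(stmt)
        finally:
            conn.close()              # tear the session down before the next batch

# 500 single-row inserts end up spread across 5 sessions of <=120 each.
stmts = [f"INSERT INTO analyticsdb.test_p VALUES ({i})" for i in range(500)]
batches = chunk(stmts)
```

The batching itself is trivial; the part that matters is that every batch gets a completely new session, which is what avoids the duplicate staging-file names for me.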