Created 06-03-2017 01:32 PM
I have run into a problem.
When the same SQL query is submitted 6 times at the same time, roughly half of the runs fail.
I suspect this is related to YARN resource allocation: tasks that are left waiting because no resources are available get killed.
My configuration is below. Please help me take a look.
yarn.scheduler.minimum-allocation-mb=3072M
tez.am.resource.memory.mb=3072
tez.task.resource.memory.mb=3072
hive.tez.container.size=3072
tez.container.max.java.heap.fraction=0.8
tez.am.grouping.split-wave=1.4
These are error logs
Vertex failed, vertexName=Map 1, vertexId=vertex_1496317022433_21566_2_05, diagnostics= Vertex vertex_1496317022433_21566_2_05 Map 1 killed/failed due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: login_data initializer failed, vertex=vertex_1496317022433_21566_2_05 Map 1
java.lang.IllegalArgumentException: Illegal Capacity: -1
    at java.util.ArrayList.<init>(ArrayList.java:142)
    at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:332)
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:305)
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:407)
    at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:155)
    at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:273)
    at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:266)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:266)
    at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:253)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Vertex failed, vertexName=Map 5, vertexId=vertex_1496317022433_21566_2_01, diagnostics= Vertex vertex_1496317022433_21566_2_01 Map 5 killed/failed due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: applet_version_tbl initializer failed, vertex=vertex_1496317022433_21566_2_01 Map 5
java.lang.IllegalArgumentException: Illegal Capacity: -1
    at java.util.ArrayList.<init>(ArrayList.java:142)
    at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:332)
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:305)
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:407)
    at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:155)
    at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:273)
    at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:266)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:266)
    at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:253)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Vertex failed, vertexName=Map 11, vertexId=vertex_1496317022433_21566_2_09, diagnostics= Vertex vertex_1496317022433_21566_2_09 Map 11 killed/failed due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: dwb_biz_msg_user_opt_ds initializer failed, vertex=vertex_1496317022433_21566_2_09 Map 11
java.lang.IllegalArgumentException: Illegal Capacity: -1
    at java.util.ArrayList.<init>(ArrayList.java:142)
    at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:332)
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:305)
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:407)
    at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:155)
    at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:273)
    at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:266)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:266)
    at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:253)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Vertex failed, vertexName=Map 13, vertexId=vertex_1496317022433_21566_2_06, diagnostics= Vertex vertex_1496317022433_21566_2_06 Map 13 killed/failed due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: dim_oth_pub_date initializer failed, vertex=vertex_1496317022433_21566_2_06 Map 13
java.lang.IllegalArgumentException: Illegal Capacity: -1
    at java.util.ArrayList.<init>(ArrayList.java:142)
    at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:332)
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:305)
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:407)
    at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:155)
    at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:273)
    at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:266)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:266)
    at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:253)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Vertex killed, vertexName=Reducer 8, vertexId=vertex_1496317022433_21566_2_03, diagnostics= Vertex received Kill in INITED state., Vertex vertex_1496317022433_21566_2_03 Reducer 8 killed/failed due to:OTHER_VERTEX_FAILURE
Vertex killed, vertexName=Map 10, vertexId=vertex_1496317022433_21566_2_02, diagnostics= Vertex received Kill in INITED state., Vertex vertex_1496317022433_21566_2_02 Map 10 killed/failed due to:OTHER_VERTEX_FAILURE
Vertex killed, vertexName=Reducer 9, vertexId=vertex_1496317022433_21566_2_04, diagnostics= Vertex received Kill in INITED state., Vertex vertex_1496317022433_21566_2_04 Reducer 9 killed/failed due to:OTHER_VERTEX_FAILURE
Vertex killed, vertexName=Map 6, vertexId=vertex_1496317022433_21566_2_00, diagnostics= Vertex received Kill in INITED state., Vertex vertex_1496317022433_21566_2_00 Map 6 killed/failed due to:OTHER_VERTEX_FAILURE
Vertex killed, vertexName=Reducer 2, vertexId=vertex_1496317022433_21566_2_11, diagnostics= Vertex received Kill in INITED state., Vertex vertex_1496317022433_21566_2_11 Reducer 2 killed/failed due to:OTHER_VERTEX_FAILURE
Vertex killed, vertexName=Reducer 12, vertexId=vertex_1496317022433_21566_2_10, diagnostics= Vertex received Kill in INITED state., Vertex vertex_1496317022433_21566_2_10 Reducer 12 killed/failed due to:OTHER_VERTEX_FAILURE
Vertex killed, vertexName=Reducer 4, vertexId=vertex_1496317022433_21566_2_13, diagnostics= Vertex received Kill in INITED state., Vertex vertex_1496317022433_21566_2_13 Reducer 4 killed/failed due to:OTHER_VERTEX_FAILURE
Vertex killed, vertexName=Reducer 3, vertexId=vertex_1496317022433_21566_2_12, diagnostics= Vertex received Kill in INITED state., Vertex vertex_1496317022433_21566_2_12 Reducer 3 killed/failed due to:OTHER_VERTEX_FAILURE
Vertex killed, vertexName=Map 14, vertexId=vertex_1496317022433_21566_2_07, diagnostics= Vertex received Kill in INITED state., Vertex vertex_1496317022433_21566_2_07 Map 14 killed/failed due to:OTHER_VERTEX_FAILURE
Vertex killed, vertexName=Reducer 15, vertexId=vertex_1496317022433_21566_2_08, diagnostics= Vertex received Kill in INITED state., Vertex vertex_1496317022433_21566_2_08 Reducer 15 killed/failed due to:OTHER_VERTEX_FAILURE
DAG did not succeed due to VERTEX_FAILURE. failedVertices:4 killedVertices:10
As you can see, 4 vertices failed.
YARN log:
org.apache.tez.dag.app.dag.impl.AMUserCodeException: java.lang.IllegalArgumentException: Illegal Capacity: -1
    at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallback.onFailure(RootInputInitializerManager.java:319)
    at com.google.common.util.concurrent.Futures$6.run(Futures.java:977)
    at com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:253)
    at com.google.common.util.concurrent.ExecutionList$RunnableExecutorPair.execute(ExecutionList.java:149)
    at com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:134)
    at com.google.common.util.concurrent.ListenableFutureTask.done(ListenableFutureTask.java:86)
    at java.util.concurrent.FutureTask.finishCompletion(FutureTask.java:380)
    at java.util.concurrent.FutureTask.setException(FutureTask.java:247)
    at java.util.concurrent.FutureTask.run(FutureTask.java:267)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
YARN log aggregation status: TIME_OUT
What I observed is that when several tasks run at once, the earlier ones use up the resources. The later tasks get no resources, keep waiting, and are then killed.
This seems wrong to me: when no resources are available, a task should stay queued until it can get them, rather than being killed.
So, is something in my configuration wrong?
Please let me know if I need to provide additional information.
Created 06-05-2017 07:37 AM
The cause was a resource queue configuration error, specifically in the Fair Scheduler queue configuration.
I found that the maximum resources of a queue are related to the minimum resource unit (the minimum allocation).
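The relationship between a queue maximum and the minimum allocation unit can be illustrated with a little arithmetic. The sketch below is only an illustration under stated assumptions: it assumes each container request is rounded up to a whole multiple of the minimum allocation (the Fair Scheduler may round by a separate increment setting instead), and QUEUE_MAX_MB is a made-up value, not the real queue setting from this cluster. The function names are hypothetical helpers, not YARN APIs.

```python
import math

def normalize_request(requested_mb, min_alloc_mb):
    """Round a memory request up to a whole multiple of the minimum allocation.

    Assumption: the scheduler grants memory in units of min_alloc_mb.
    """
    units = max(1, math.ceil(requested_mb / min_alloc_mb))
    return units * min_alloc_mb

def containers_that_fit(queue_max_mb, container_mb, min_alloc_mb):
    """How many containers of the given size fit under the queue maximum."""
    granted_mb = normalize_request(container_mb, min_alloc_mb)
    return queue_max_mb // granted_mb

MIN_ALLOC_MB = 3072   # yarn.scheduler.minimum-allocation-mb from the question
CONTAINER_MB = 3072   # hive.tez.container.size / tez.am.resource.memory.mb
QUEUE_MAX_MB = 16384  # hypothetical queue maximum, for illustration only

fit = containers_that_fit(QUEUE_MAX_MB, CONTAINER_MB, MIN_ALLOC_MB)
leftover = QUEUE_MAX_MB - fit * normalize_request(CONTAINER_MB, MIN_ALLOC_MB)
print(f"containers that fit: {fit}, unusable remainder: {leftover} MB")
```

With these example numbers only 5 containers fit and 1024 MB can never be used, so six concurrent queries could not even all obtain an application master. A queue maximum that is a whole multiple of the minimum allocation avoids the stranded remainder.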
Created 06-05-2017 07:40 AM
I have solved this problem