Created 03-23-2017 05:45 PM
When I try to insert data from a table into a partitioned bucketed table, I am getting this error:
Status: Failed
Vertex failed, vertexName=Map 1, vertexId=vertex_1490155524314_0037_1_00, diagnostics=[Task failed, taskId=task_1490155524314_0037_1_00_000007, diagnostics=[
TaskAttempt 0 failed, info=[attempt_1490155524314_0037_1_00_000007_0 being failed for too many output errors. failureFraction=0.125, MAX_ALLOWED_OUTPUT_FAILURES_FRACTION=0.1, uniquefailedOutputReports=1, MAX_ALLOWED_OUTPUT_FAILURES=10, MAX_ALLOWED_TIME_FOR_TASK_READ_ERROR_SEC=300, readErrorTimespan=0],
TaskAttempt 1 failed, info=[attempt_1490155524314_0037_1_00_000007_1 being failed for too many output errors. failureFraction=0.125, MAX_ALLOWED_OUTPUT_FAILURES_FRACTION=0.1, uniquefailedOutputReports=1, MAX_ALLOWED_OUTPUT_FAILURES=10, MAX_ALLOWED_TIME_FOR_TASK_READ_ERROR_SEC=300, readErrorTimespan=0],
TaskAttempt 2 failed, info=[attempt_1490155524314_0037_1_00_000007_2 being failed for too many output errors. failureFraction=0.125, MAX_ALLOWED_OUTPUT_FAILURES_FRACTION=0.1, uniquefailedOutputReports=1, MAX_ALLOWED_OUTPUT_FAILURES=10, MAX_ALLOWED_TIME_FOR_TASK_READ_ERROR_SEC=300, readErrorTimespan=0],
TaskAttempt 3 failed, info=[attempt_1490155524314_0037_1_00_000007_3 being failed for too many output errors. failureFraction=0.125, MAX_ALLOWED_OUTPUT_FAILURES_FRACTION=0.1, uniquefailedOutputReports=1, MAX_ALLOWED_OUTPUT_FAILURES=10, MAX_ALLOWED_TIME_FOR_TASK_READ_ERROR_SEC=300, readErrorTimespan=0]],
Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:14, Vertex vertex_1490155524314_0037_1_00 [Map 1] killed/failed due to:OWN_TASK_FAILURE]
Vertex killed, vertexName=Reducer 2, vertexId=vertex_1490155524314_0037_1_01, diagnostics=[Vertex received Kill while in RUNNING state., Vertex did not succeed due to OTHER_VERTEX_FAILURE, failedTasks:0 killedTasks:8, Vertex vertex_1490155524314_0037_1_01 [Reducer 2] killed/failed due to:OTHER_VERTEX_FAILURE]
DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:1
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask.
Created 04-01-2018 05:27 PM
Hi
Were you able to find a solution? I am facing the same issue.
When I run the query from the Hive shell or Zeppelin as the hive user, the query works fine.
But if I run it as another user, it sometimes works, but most of the time I get the Vertex error.
Thanks & Regards
Created 12-27-2018 10:42 AM
Hi @kerra I am also getting a vertex failed error when I try to insert into a table with the Tez engine in Hive. Could you please mention which user permissions need to be set?
I am running Hive as the hdfs user.
Thank you.
Created 05-15-2019 04:07 PM
Hi guys, I am having the same problem. When I run a query (select count(*) from table_name) on a small table, it succeeds, but when the table is big, I get this error. I checked the YARN logs, and the problem seems to occur during data shuffling, so I traced it to the node that received the task. I found this error in /var/log/hadoop-yarn/yarn/hadoop-yarn-nodemanager-myhost.com.log:
/var/lib/hadoop/yarn/local/usercache/hive/appcache/application_1557491114054_0010/output/attempt_1557491114054_0010_1_03_000000_1_10002/file.out not found
although the file exists normally for other attempts of the same application.
And in the YARN application log, after exiting the Beeline session, this error appears:
2019-05-14 16:19:58,442 [WARN] [Fetcher_B {Map_1} #0] |shuffle.Fetcher|: copyInputs failed for tasks [InputAttemptIdentifier [inputIdentifier=0, attemptNumber=0, pathComponent=attempt_1557754551780_0155_5_00_000000_0_10003, spillType=0, spillId=-1]]
2019-05-14 16:19:58,442 [INFO] [Fetcher_B {Map_1} #0] |impl.ShuffleManager|: Map_1: Fetch failed for src: InputAttemptIdentifier [inputIdentifier=0, attemptNumber=0, pathComponent=attempt_1557754551780_0155_5_00_000000_0_10003, spillType=0, spillId=-1]InputIdentifier: InputAttemptIdentifier [inputIdentifier=0, attemptNumber=0, pathComponent=attempt_1557754551780_0155_5_00_000000_0_10003, spillType=0, spillId=-1], connectFailed: false
2019-05-14 16:19:58,443 [INFO] [Fetcher_B {Map_1} #1] |HttpConnection.url|: for url=http://myhost_name:13562/mapOutput?job=job_1557754551780_0155&dag=5&reduce=0&map=attempt_1557754551780_0155_5_00_000000_0_10003 sent hash and receievd reply 0 ms
2019-05-14 16:19:58,443 [INFO] [Fetcher_B {Map_1} #1] |shuffle.Fetcher|: Failed to read data to memory for InputAttemptIdentifier [inputIdentifier=0, attemptNumber=0, pathComponent=attempt_1557754551780_0155_5_00_000000_0_10003, spillType=0, spillId=-1]. len=28, decomp=14. ExceptionMessage=Not a valid ifile header
2019-05-14 16:19:58,443 [WARN] [Fetcher_B {Map_1} #1] |shuffle.Fetcher|: Failed to shuffle output of InputAttemptIdentifier [inputIdentifier=0, attemptNumber=0, pathComponent=attempt_1557754551780_0155_5_00_000000_0_10003, spillType=0, spillId=-1] from myhost_name
java.io.IOException: Not a valid ifile header
    at org.apache.tez.runtime.library.common.sort.impl.IFile$Reader.verifyHeaderMagic(IFile.java:859)
    at org.apache.tez.runtime.library.common.sort.impl.IFile$Reader.isCompressedFlagEnabled(IFile.java:866)
    at org.apache.tez.runtime.library.common.sort.impl.IFile$Reader.readToMemory(IFile.java:616)
    at org.apache.tez.runtime.library.common.shuffle.ShuffleUtils.shuffleToMemory(ShuffleUtils.java:121)
    at org.apache.tez.runtime.library.common.shuffle.Fetcher.fetchInputs(Fetcher.java:950)
    at org.apache.tez.runtime.library.common.shuffle.Fetcher.doHttpFetch(Fetcher.java:599)
    at org.apache.tez.runtime.library.common.shuffle.Fetcher.doHttpFetch(Fetcher.java:486)
    at org.apache.tez.runtime.library.common.shuffle.Fetcher.callInternal(Fetcher.java:284)
    at org.apache.tez.runtime.library.common.shuffle.Fetcher.callInternal(Fetcher.java:76)
    at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
    at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108)
    at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41)
    at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
I am using HDP 3.1.
Any suggestions on what might be causing this error?
Thanks
Created 05-21-2019 03:44 AM
I'm having the same issue with HDP 3.1 (Tez 0.9.1).
I can reproduce it with:
1) Create two files: file1.csv and file2.csv
2) Add two fields to the csv files as below:
one,two
one,two
one,two
3) Create an external table:
use testdb;
create external table test1(s1 string, s2 string) row format delimited fields terminated by ',' stored as textfile location '/user/usera/test1';
4) Copy one csv file to HDFS (/user/usera/test1):
hdfs dfs -put ./file1.csv /user/usera/test1/
5) select count(*) from testdb.test1; => works fine.
6) Copy the second csv file to HDFS:
hdfs dfs -put ./file2.csv /user/usera/test1/
7) select * from testdb.test1; => Can see the data in both HDFS files.
8) select count(*) from testdb.test1; => Get this problem.
And we can see the following error in the mapper task's log:
2019-05-17 10:08:10,317 [INFO] [Fetcher_B {Map_1} #1] |shuffle.Fetcher|: Failed to read data to memory for InputAttemptIdentifier [inputIdentifier=1, attemptNumber=0, pathComponent=attempt_1557383221332_0289_1_00_000001_0_10003, spillType=0, spillId=-1]. len=25, decomp=11. ExceptionMessage=Not a valid ifile header
2019-05-17 10:08:10,317 [WARN] [Fetcher_B {Map_1} #1] |shuffle.Fetcher|: Failed to shuffle output of InputAttemptIdentifier [inputIdentifier=1, attemptNumber=0, pathComponent=attempt_1557383221332_0289_1_00_000001_0_10003, spillType=0, spillId=-1] from XXXXX
java.io.IOException: Not a valid ifile header
    at org.apache.tez.runtime.library.common.sort.impl.IFile$Reader.verifyHeaderMagic(IFile.java:859)
    at org.apache.tez.runtime.library.common.sort.impl.IFile$Reader.isCompressedFlagEnabled(IFile.java:866)
    at org.apache.tez.runtime.library.common.sort.impl.IFile$Reader.readToMemory(IFile.java:616)
    at org.apache.tez.runtime.library.common.shuffle.ShuffleUtils.shuffleToMemory(ShuffleUtils.java:121)
    at org.apache.tez.runtime.library.common.shuffle.Fetcher.fetchInputs(Fetcher.java:950)
    at org.apache.tez.runtime.library.common.shuffle.Fetcher.doHttpFetch(Fetcher.java:599)
    at org.apache.tez.runtime.library.common.shuffle.Fetcher.doHttpFetch(Fetcher.java:486)
    at org.apache.tez.runtime.library.common.shuffle.Fetcher.callInternal(Fetcher.java:284)
    at org.apache.tez.runtime.library.common.shuffle.Fetcher.callInternal(Fetcher.java:76)
    at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
    at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108)
    at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41)
    at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
I think it's similar to https://issues.apache.org/jira/browse/TEZ-3699
I've confirmed that patch is already applied to Tez in HDP 3.1.
So I guess it's a new bug in Tez 0.9.x
(I confirmed there is no problem with HDP 2.6 / Tez 0.7.0).
Any idea?
Created 05-22-2019 04:27 AM
It looks like a Tez issue that comes from the "fs.permissions.umask-mode" setting.
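If a restrictive umask really is the culprit, the mechanics are easy to demonstrate locally: the mapper writes its intermediate shuffle files as the submitting user, and under umask 077 those files are owner-only, so a fetch served by a different user (the NodeManager's shuffle handler) cannot read them. A minimal local sketch of just the permission effect, not a cluster fix (the 077/022 values and temp paths are illustrative):

```shell
# Files created under umask 077 get mode 600 (owner-only), while the
# same touch under umask 022 yields mode 644 (readable by other users,
# e.g. the user the shuffle handler runs as).
tmpdir=$(mktemp -d)

umask 077
touch "$tmpdir/strict.out"
stat -c '%a' "$tmpdir/strict.out"    # 600: other users cannot read this file

umask 022
touch "$tmpdir/relaxed.out"
stat -c '%a' "$tmpdir/relaxed.out"   # 644: world-readable
```

On the cluster side the equivalent would presumably be relaxing fs.permissions.umask-mode (e.g. to 022) in core-site.xml, or per session with `set fs.permissions.umask-mode=022;` in Hive before rerunning the failing query; verify against your cluster's security policy before changing it.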