05-15-2019 02:59 PM
Hi, I am having the same problem after upgrading from HDP 2.6.2 to HDP 3.1, even though I have a lot of resources in the cluster. When I ran a query (select count(*) from table) against a small table (~3k records) it ran successfully, but against a larger table (~50k records) I get the same vertex failure error. I checked the YARN application log for the failed query (the command I used is at the end of this post) and I see the error below:

2019-05-14 11:58:14,823 [INFO] [TezChild] |tez.ReduceRecordProcessor|: Starting Output: out_Reducer 2
2019-05-14 11:58:14,828 [INFO] [TezChild] |compress.CodecPool|: Got brand-new decompressor [.snappy]
2019-05-14 11:58:18,466 [INFO] [TaskHeartbeatThread] |task.TaskReporter|: Routing events from heartbeat response to task, currentTaskAttemptId=attempt_1557754551780_0137_1_01_000000_0, eventCount=1 fromEventId=1 nextFromEventId=2
2019-05-14 11:58:18,488 [INFO] [Fetcher_B {Map_1} #1] |HttpConnection.url|: for url=http://myhost_name.com:13562/mapOutput?job=job_1557754551780_0137&dag=1&reduce=0&map=attempt_1557754551780_0137_1_00_000000_0_10002 sent hash and receievd reply 0 ms
2019-05-14 11:58:18,491 [INFO] [Fetcher_B {Map_1} #1] |shuffle.Fetcher|: Failed to read data to memory for InputAttemptIdentifier [inputIdentifier=0, attemptNumber=0, pathComponent=attempt_1557754551780_0137_1_00_000000_0_10002, spillType=0, spillId=-1]. len=28, decomp=14. ExceptionMessage=Not a valid ifile header
2019-05-14 11:58:18,492 [WARN] [Fetcher_B {Map_1} #1] |shuffle.Fetcher|: Failed to shuffle output of InputAttemptIdentifier [inputIdentifier=0, attemptNumber=0, pathComponent=attempt_1557754551780_0137_1_00_000000_0_10002, spillType=0, spillId=-1] from myhost_name.com
java.io.IOException: Not a valid ifile header
at org.apache.tez.runtime.library.common.sort.impl.IFile$Reader.verifyHeaderMagic(IFile.java:859)
at org.apache.tez.runtime.library.common.sort.impl.IFile$Reader.isCompressedFlagEnabled(IFile.java:866)
at org.apache.tez.runtime.library.common.sort.impl.IFile$Reader.readToMemory(IFile.java:616)
at org.apache.tez.runtime.library.common.shuffle.ShuffleUtils.shuffleToMemory(ShuffleUtils.java:121)
at org.apache.tez.runtime.library.common.shuffle.Fetcher.fetchInputs(Fetcher.java:950)
at org.apache.tez.runtime.library.common.shuffle.Fetcher.doHttpFetch(Fetcher.java:599)
at org.apache.tez.runtime.library.common.shuffle.Fetcher.doHttpFetch(Fetcher.java:486)
at org.apache.tez.runtime.library.common.shuffle.Fetcher.callInternal(Fetcher.java:284)
at org.apache.tez.runtime.library.common.shuffle.Fetcher.callInternal(Fetcher.java:76)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108)
at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41)
at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

Both queries were working fine before the upgrade. The only change I made after the upgrade was increasing the heap size of the DataNodes. I also followed @Geoffrey Shelton Okot's configuration suggestions, but I still get the same error. Thanks.
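For reference, this is roughly how I pulled the log quoted above (a sketch; the application id is inferred from the attempt ids in the stack trace, and the output file name is just a placeholder):

# fetch the aggregated YARN log for the failed query
yarn logs -applicationId application_1557754551780_0137 > app_1557754551780_0137.log
# then search the container logs for the fetch failure
grep -i "Not a valid ifile header" app_1557754551780_0137.log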