Member since
12-21-2020
91
Posts
8
Kudos Received
13
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| 3202 | 08-12-2021 05:16 AM | |
| 3539 | 06-29-2021 06:21 AM | |
| 3769 | 06-16-2021 07:15 AM | |
| 3368 | 06-14-2021 12:08 AM | |
| 9751 | 05-14-2021 06:03 AM |
04-29-2021
09:08 AM
I also had the exact same error. I'm sure it is indicating that something is wrong with the Fair Scheduler config. If this setup is currently under test, can you check whether you're able to get it working with Capacity Scheduler? This will help us isolate that the issue is indeed related to YARN scheduler. Thanks, Megh
... View more
04-29-2021
05:41 AM
Hi, Can you share the impalad.FATAL error log from one of your impala Daemons? Thanks, Megh
... View more
04-28-2021
10:54 PM
Hi @dmharshit , Can you share the log snippets from Hive Metastore, Hiveserver2 and YARN ResourceManager from the timeframe of execution of this query? Thanks, Megh
... View more
04-28-2021
10:30 PM
Hi, I had got this same error when I changed my YARN scheduler from Capacity Scheduler to Fair Scheduler. I reverted to capacity scheduler and it went away. Is it the case with you as well? Thanks, Megh
... View more
04-27-2021
07:34 AM
1 Kudo
I think the repo itself is broken. navigating to http://repos.bigtop.apache.org/releases/1.5.0/centos/7/$basearch, takes to this: <Error>
<Code>NoSuchKey</Code>
<Message>The specified key does not exist.</Message>
<Key>releases/1.5.0/centos/7/$basearch</Key>
<RequestId>K3BYRJWH03RKGFVK</RequestId>
<HostId>l+++bHT4W/dNczJrC5KWH7vPQ4dydVA6kLD69001OO8XQKpj6ziHZxLIsVCEIBsoZZT7o+1ILEQ=</HostId>
</Error> Can you point to the documentation you're referring for this installation? Thanks, Megh
... View more
04-27-2021
12:58 AM
1 Kudo
Hi @ryu , It seems that your repo is not correctly configured. Can you paste the output of "yum list webhcat-tar-hive*"? Also, share the contents of /etc/yum.repos.d/<your repo file>.repo Thanks, Megh
... View more
04-24-2021
02:53 AM
1 Kudo
Came across this bug and followed the steps given in this comment. Haven't faced this issue after that. Thanks, Megh
... View more
04-22-2021
06:10 AM
Hello Everyone, I have an external table (Text with Gzip compression) partitioned on a date column. I have created another external table with the same structure but with Parquet format with Snappy compression. I want to copy all the data from the text table to parquet table, and I'm using the following query for testing it out for one day. insert into parquet_table partition(asdt) select * from text_table where partition_col='somedate'; The partition contains almost 100M rows. This query runs for almost half an hour and then results into this error: ERROR : Vertex failed, vertexName=Reducer 2, vertexId=vertex_1619005705788_0004_2_01, diagnostics=[Task failed, taskId=task_1619005705788_0004_2_01_000240, diagnostics=[TaskAttempt 0 failed, info=[Error: Error while running task ( failure ) : org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$ShuffleError: error in shuffle in Fetcher_O {Map_1} #1
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.callInternal(Shuffle.java:306)
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.callInternal(Shuffle.java:288)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:69)
at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Map_1: Shuffle failed with too many fetch failures and insufficient progress!failureCounts=20, pendingInputs=20, fetcherHealthy=false, reducerProgressedEnough=true, reducerStalled=true
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.ShuffleScheduler.isShuffleHealthy(ShuffleScheduler.java:1053)
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.ShuffleScheduler.copyFailed(ShuffleScheduler.java:794)
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.copyFromHost(FetcherOrderedGrouped.java:318)
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.fetchNext(FetcherOrderedGrouped.java:182)
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.callInternal(FetcherOrderedGrouped.java:194)
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.callInternal(FetcherOrderedGrouped.java:57)
... 7 more
, errorMessage=Shuffle Runner Failed:org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$ShuffleError: error in shuffle in Fetcher_O {Map_1} #1
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.callInternal(Shuffle.java:306)
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.callInternal(Shuffle.java:288)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:69)
at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Map_1: Shuffle failed with too many fetch failures and insufficient progress!failureCounts=20, pendingInputs=20, fetcherHealthy=false, reducerProgressedEnough=true, reducerStalled=true
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.ShuffleScheduler.isShuffleHealthy(ShuffleScheduler.java:1053)
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.ShuffleScheduler.copyFailed(ShuffleScheduler.java:794)
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.copyFromHost(FetcherOrderedGrouped.java:318)
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.fetchNext(FetcherOrderedGrouped.java:182)
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.callInternal(FetcherOrderedGrouped.java:194)
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.callInternal(FetcherOrderedGrouped.java:57)
... 7 more
], TaskAttempt 1 failed, info=[Error: Error while running task ( failure ) : org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$ShuffleError: error in shuffle in Fetcher_O {Map_1} #7
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.callInternal(Shuffle.java:306)
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.callInternal(Shuffle.java:288)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:69)
at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Map_1: Shuffle failed with too many fetch failures and insufficient progress!failureCounts=24, pendingInputs=24, fetcherHealthy=false, reducerProgressedEnough=true, reducerStalled=true
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.ShuffleScheduler.isShuffleHealthy(ShuffleScheduler.java:1053)
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.ShuffleScheduler.copyFailed(ShuffleScheduler.java:794)
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.copyFromHost(FetcherOrderedGrouped.java:318)
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.fetchNext(FetcherOrderedGrouped.java:182)
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.callInternal(FetcherOrderedGrouped.java:194)
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.callInternal(FetcherOrderedGrouped.java:57)
... 7 more
, errorMessage=Shuffle Runner Failed:org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$ShuffleError: error in shuffle in Fetcher_O {Map_1} #7
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.callInternal(Shuffle.java:306)
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.callInternal(Shuffle.java:288)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:69)
at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748) I'm running this in hive on tez in my CDP 7.1.5 I've disabled the default transactional property and one time insert only properties in Hive on Tez config. I think I'm hitting some bug but unable to find any useful info in the hiveserver/Resourcemanager/metastore/nodemanager logs. Has anybody else encountered this? Thanks, Megh
... View more
Labels:
- Labels:
-
Apache Hive
-
Apache Tez
-
Cloudera
04-22-2021
03:22 AM
Understood. In our case, the culprit turned out to be 50075 port in firewall. I've added as an answer. In any case, thanks for your support. Thanks, Megh
... View more
04-22-2021
01:19 AM
This Issue was resolved after we opened 50075 port (Datanode WebHDFS port) between the source and destination cluster. I think the error org.apache.hadoop.tools.mapred.RetriableFileCopyCommand$CopyReadException: org.apache.hadoop.ipc.StandbyException: Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error is rather misleading. Once I tried to distcp with the active namenode instead of the nameservice, I was able to get the root cause saying "connection timed out" on port 50075. In the official documentation of Cloudera this port is not mentioned as a requirement. Thanks, Megh
... View more