Spark shuffle is failing with connection exception when dynamic allocation is enabled
- Labels: Apache Hadoop, Apache Spark
Created 03-27-2018 10:30 AM
Hi,
When dynamic allocation is enabled, we frequently hit errors while fetching shuffle blocks:
RetryingBlockFetcher: Retrying fetch (1/3) for 1 outstanding blocks after 5000ms
ERROR RetryingBlockFetcher: Exception while beginning fetch of 1 outstanding blocks (after 1 retries)
java.io.IOException: Failed to connect to <host>:<some port>
Caused by: java.net.ConnectException: Connection refused: <host>:<some port>
We see these errors continuously in the executors when we run large Spark jobs. While they occur, nothing is processed; after some time the errors disappear and processing resumes. This is impacting our job SLAs. Can anyone help with this?
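For anyone trying to quantify the stall: the "(1/3)" and "5000ms" in the retry line match Spark's default values for `spark.shuffle.io.maxRetries` (3) and `spark.shuffle.io.retryWait` (5s). The sketch below is only an illustration, not something taken from the job above; the function names are hypothetical. It parses a retry line and estimates how long one fetch can spend sleeping between retries, excluding the time spent in the connection attempts themselves.

```python
import re

# Matches the RetryingBlockFetcher retry line quoted in the post.
# \s* tolerates both "5000ms" and "5000 ms".
RETRY_RE = re.compile(
    r"RetryingBlockFetcher: Retrying fetch \((\d+)/(\d+)\) "
    r"for (\d+) outstanding blocks after (\d+)\s*ms"
)


def parse_retry(line):
    """Return retry details from a log line, or None if it does not match."""
    m = RETRY_RE.search(line)
    if m is None:
        return None
    attempt, max_retries, blocks, wait_ms = map(int, m.groups())
    return {
        "attempt": attempt,
        "max_retries": max_retries,
        "blocks": blocks,
        "wait_ms": wait_ms,
    }


def max_wait_ms(info):
    """Upper bound on time spent sleeping between retries for one fetch."""
    return info["max_retries"] * info["wait_ms"]


sample = "RetryingBlockFetcher: Retrying fetch (1/3) for 1 outstanding blocks after 5000ms"
info = parse_retry(sample)
print(info)
print(max_wait_ms(info))  # 15000
```

Running this over executor logs and counting matches per host may help show whether the refused connections cluster on a few hosts or are spread across the cluster.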
Created 04-18-2021 02:32 AM
Hi Cloudera,
Can someone please help with this issue?
I'm also facing this in our production environment, and it is impacting our SLA.
Created 04-18-2021 11:45 PM
Hello,
Can you check and increase the parameters below?
--conf spark.executor.memory=XXX
Also consider increasing the number of executors.
See the doc below for tuning your Spark jobs:
https://blog.cloudera.com/how-to-tune-your-apache-spark-jobs-part-2/
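As a rough illustration of the suggestion above, the sketch below assembles `--conf` flags for `spark-submit` from a dictionary. The property names are real Spark settings, but every value shown (`8g`, `50`) and the script name `your_job.py` are hypothetical placeholders, not recommendations for this job. With dynamic allocation the executor count is bounded by `spark.dynamicAllocation.maxExecutors` rather than fixed, and the external shuffle service (`spark.shuffle.service.enabled`) is typically expected to be on, so both are included here as assumptions worth checking.

```python
def build_conf_args(conf):
    """Turn a dict of Spark properties into a flat list of --conf flags."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


# All values below are hypothetical placeholders; size them for your cluster.
conf = {
    "spark.executor.memory": "8g",
    "spark.dynamicAllocation.enabled": "true",
    "spark.dynamicAllocation.maxExecutors": "50",
    "spark.shuffle.service.enabled": "true",
}

# "your_job.py" is a hypothetical application script.
cmd = ["spark-submit"] + build_conf_args(conf) + ["your_job.py"]
print(" ".join(cmd))
```

Keeping the properties in one dictionary makes it easier to change a single value between runs and compare the effect on the fetch errors.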
Created 04-23-2021 07:17 AM
Try running the command with "--deploy-mode cluster" added.
It should work; this seems to be a bug:
https://support.oracle.com/knowledge/Oracle%20Database%20Products/2498643_1.html
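For reference, a minimal sketch of what adding that flag looks like when the command is built programmatically. The helper name, the `yarn` master, and `your_job.py` are assumptions for illustration; the thread does not state which cluster manager or application is in use.

```python
import shlex

# spark-submit accepts these two deploy modes.
VALID_MODES = {"client", "cluster"}


def spark_submit_cmd(app, deploy_mode="cluster", master="yarn", app_args=()):
    """Build a spark-submit argument list with an explicit deploy mode."""
    if deploy_mode not in VALID_MODES:
        raise ValueError(f"unsupported deploy mode: {deploy_mode!r}")
    return [
        "spark-submit",
        "--master", master,            # assumed cluster manager
        "--deploy-mode", deploy_mode,
        app,
        *app_args,
    ]


# "your_job.py" is a hypothetical application script.
cmd = spark_submit_cmd("your_job.py")
print(shlex.join(cmd))
```

In cluster mode the driver runs inside the cluster instead of on the submitting machine, so driver logs have to be collected from the cluster manager rather than the local console.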
