Even after configuring the initial blockManager.driver.port and blockManager.port, and setting maxRetries to 300, the DataNode still attempts to use random ports. Why does this happen?
- Labels: Apache Spark, Apache YARN
Created 02-05-2025 09:15 PM
I configured spark.blockManager.driver.port as 21750, spark.blockManager.port as 21700, and spark.port.maxRetries as 300.
25/02/05 17:16:08 WARN TransportChannelHandler: Exception in connection from /172.21.2X0.XXX:59698
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
at sun.nio.ch.IOUtil.read(IOUtil.java:192)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
at io.netty.buffer.PooledByteBuf.setBytes(PooledByteBuf.java:254)
at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:1132)
at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:357)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:151)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:788)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:724)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:650)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
Created 02-05-2025 09:55 PM
@thoufeeq1218, Welcome to our community! To help you get the best possible answer, I have tagged in our Spark experts @Babasaheb @ggangadharan who may be able to assist you further.
Please feel free to provide any additional information or details about your query, and we hope that you will find a satisfactory solution to your question.
Regards,
Vidya Sargur, Community Manager
Created on 02-06-2025 12:37 AM - edited 02-06-2025 12:38 AM
Hello @thoufeeq1218,
We understand that you have configured spark.blockManager.driver.port and spark.blockManager.port, but Spark may still use random ports for the following reasons:
- Why is Spark using random ports?
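Spark uses additional ports beyond the BlockManager ports for communication between the driver and executors.
The port seen in the log (59698) is an ephemeral port. On Linux the default ephemeral range is 32768–60999 (net.ipv4.ip_local_port_range), not 1024–65535. It could have been assigned because of:
spark.driver.port (default: random)
spark.executor.port (default: random; this property does not appear in the configuration reference for current Spark releases, so it may have no effect on your version)
Note also that the address in an "Exception in connection from" message is the remote end of the connection. If that end opened the connection, its port is a source port chosen by the operating system, and no Spark setting controls it.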
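To see which remote ports the warnings actually involve, the driver log can be scanned for the "Exception in connection from" lines. The following is a minimal Python sketch, assuming the log format shown in this thread. The helper name remote_ports, the sample address 10.0.0.5, and the 32768 to 60999 range (a common Linux default) are assumptions for illustration, so check the range on your own hosts.

```python
import re

# Matches the remote endpoint at the end of lines such as
# "... WARN TransportChannelHandler: Exception in connection from /10.0.0.5:59698"
_REMOTE = re.compile(r"Exception in connection from /(\S+):(\d+)\s*$")

# Assumed ephemeral range (common Linux default); adjust to your hosts.
EPHEMERAL = range(32768, 61000)


def remote_ports(log_lines):
    """Return (host, port, is_ephemeral) for each matching log line."""
    found = []
    for line in log_lines:
        match = _REMOTE.search(line)
        if match:
            port = int(match.group(2))
            found.append((match.group(1), port, port in EPHEMERAL))
    return found


if __name__ == "__main__":
    # Hypothetical sample line in the same shape as the log above.
    sample = [
        "25/02/05 17:16:08 WARN TransportChannelHandler: "
        "Exception in connection from /10.0.0.5:59698",
        "java.io.IOException: Connection reset by peer",
    ]
    for host, port, ephemeral in remote_ports(sample):
        label = "ephemeral" if ephemeral else "outside ephemeral range"
        print(host, port, label)
```

If every reported port falls in the ephemeral range, the warnings most likely show client-side source ports rather than listeners that ignored the configured values.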
- How to restrict Spark to specific ports?
Explicitly set spark.driver.port to ensure the driver listens on a fixed port:
--conf spark.driver.port=21800
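Ensure the port set in spark.blockManager.port is available on every node.
If 21700 is occupied, Spark does not fall back to a random port. It tries 21701, 21702, and so on, up to spark.port.maxRetries additional attempts.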
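One way to confirm that a port is free before submitting the job is to try binding to it. The following is a small Python sketch using only the standard library; port_is_free is a hypothetical helper, not part of Spark, and the result only reflects the moment of the check.

```python
import socket


def port_is_free(port, host="0.0.0.0"):
    """Return True if a TCP socket can currently bind to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically "address already in use" or a permission error.
            return False
        return True


if __name__ == "__main__":
    # Ports taken from this thread.
    for port in (21700, 21750):
        print(port, "free" if port_is_free(port) else "in use")
```

Run it on the hosts where the driver and executors start, since a port that is free on one node may be taken on another.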
- Understanding spark.port.maxRetries
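If spark.port.maxRetries > 0 (default: 16), each retry increments the previously attempted port by 1, so Spark tries the range from the configured port to port + maxRetries. With 21700 and maxRetries=300, that range is 21700–22000.
If spark.port.maxRetries = 0, Spark fails immediately if the specified port is unavailable.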
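The Spark configuration reference describes each retry as incrementing the previously attempted port by 1, so a start port together with spark.port.maxRetries defines the range a firewall has to allow. The following is a short Python sketch of that arithmetic with the values from this thread; candidate_ports is a hypothetical helper rather than a Spark API, and it does not model any wrap-around near the top of the port space.

```python
def candidate_ports(start_port, max_retries):
    """Ports tried in order: start_port, start_port + 1, ..., start_port + max_retries."""
    if start_port == 0:
        # Port 0 asks the OS for any free port, so there is no fixed range.
        return []
    return [start_port + offset for offset in range(max_retries + 1)]


if __name__ == "__main__":
    for name, start in (
        ("spark.blockManager.port", 21700),
        ("spark.blockManager.driver.port", 21750),
    ):
        ports = candidate_ports(start, 300)
        print(name, ports[0], "to", ports[-1])
```

With maxRetries=300 the two ranges overlap (21700 to 22000 and 21750 to 22050), which is worth keeping in mind when choosing start ports.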
- Executors and Dynamic Ports
Executors start/stop dynamically and may request random ports.
If you must prevent Spark from using random ephemeral ports, use the following settings:
--conf spark.driver.port=21800
--conf spark.blockManager.driver.port=21750
--conf spark.blockManager.port=21700
--conf spark.executor.port=21810
--conf spark.port.maxRetries=0
These settings can be applied at the job level or in the Spark configuration file.
Note: If a port is already used by a running job, a new job may fail due to a port conflict.
If you found that this response helped with your query, please take a moment to log in and click KUDOS 🙂 and "Accept as Solution" below this post.
Thank you.
Created on 02-07-2025 03:34 AM - edited 02-07-2025 03:36 AM
Hi @Babasaheb,
Thanks for your time. I tried your suggestion, but I'm still facing the same issue. Here's more information: the error occurs after roughly 2 hours and 15 minutes of runtime, every time, and the YARN scheduler exits with an error.
25/02/07 14:55:32 WARN TransportChannelHandler: Exception in connection from /1X1.XX.2XX.1XX:58360
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
at sun.nio.ch.IOUtil.read(IOUtil.java:192)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
at io.netty.buffer.PooledByteBuf.setBytes(PooledByteBuf.java:254)
at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:1132)
at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:357)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:151)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:788)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:724)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:650)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.lang.Thread.run(Thread.java:750)
25/02/07 14:55:32 WARN TransportChannelHandler: Exception in connection from /1X1.XX.2XX.1XX:58388
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
at sun.nio.ch.IOUtil.read(IOUtil.java:192)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
at io.netty.buffer.PooledByteBuf.setBytes(PooledByteBuf.java:254)
at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:1132)
at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:357)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:151)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:788)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:724)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:650)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.lang.Thread.run(Thread.java:750)
25/02/07 14:55:32 WARN TransportChannelHandler: Exception in connection from /1X1.XX.2XX.1XX:58382
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
at sun.nio.ch.IOUtil.read(IOUtil.java:192)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
at io.netty.buffer.PooledByteBuf.setBytes(PooledByteBuf.java:254)
at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:1132)
at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:357)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:151)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:788)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:724)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:650)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.lang.Thread.run(Thread.java:750)
25/02/07 14:55:32 WARN TransportChannelHandler: Exception in connection from /1X1.XX.2XX.1XX:40712
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
at sun.nio.ch.IOUtil.read(IOUtil.java:192)
25/02/07 14:55:33 ERROR TransportClient: Failed to send RPC RPC 4871454967371228059 to /1X1.XX.2XX.1XX:58360: io.netty.channel.StacklessClosedChannelException
io.netty.channel.StacklessClosedChannelException
at io.netty.channel.AbstractChannel$AbstractUnsafe.write(Object, ChannelPromise)(Unknown Source)
25/02/07 14:55:33 ERROR YarnSchedulerBackend$YarnSchedulerEndpoint: Sending RequestExecutors(Map(),Map(),Map(),Set()) to AM was unsuccessful
java.io.IOException: Failed to send RPC RPC 4871454967371228059 to /1X1.XX.2XX.1XX:58360: io.netty.channel.StacklessClosedChannelException
at org.apache.spark.network.client.TransportClient$RpcChannelListener.handleFailure(TransportClient.java:395)
at org.apache.spark.network.client.TransportClient$StdChannelListener.operationComplete(TransportClient.java:372)
at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:590)
at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:557)
at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:492)
at io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:636)
at io.netty.util.concurrent.DefaultPromise.setFailure0(DefaultPromise.java:629)
at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:118)
at io.netty.channel.AbstractChannel$AbstractUnsafe.safeSetFailure(AbstractChannel.java:999)
at io.netty.channel.AbstractChannel$AbstractUnsafe.write(AbstractChannel.java:860)
at io.netty.channel.DefaultChannelPipeline$HeadContext.write(DefaultChannelPipeline.java:1367)
at io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:877)
at io.netty.channel.AbstractChannelHandlerContext.invokeWriteAndFlush(AbstractChannelHandlerContext.java:940)
at io.netty.channel.AbstractChannelHandlerContext$WriteTask.run(AbstractChannelHandlerContext.java:1247)
at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:174)
at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:167)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:569)
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.lang.Thread.run(Thread.java:750)
Caused by: io.netty.channel.StacklessClosedChannelException
at io.netty.channel.AbstractChannel$AbstractUnsafe.write(Object, ChannelPromise)(Unknown Source)
25/02/07 14:55:33 ERROR Utils: Uncaught exception in thread YARN application state monitor
org.apache.spark.SparkException: Exception thrown in awaitResult:
at org.apache.spark.util.SparkThreadUtils$.awaitResult(SparkThreadUtils.scala:56)
at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:310)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
at org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend.requestTotalExecutors(CoarseGrainedSchedulerBackend.scala:847)
at org.apache.spark.scheduler.cluster.YarnSchedulerBackend.stop(YarnSchedulerBackend.scala:114)
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.stop(YarnClientSchedulerBackend.scala:178)
at org.apache.spark.scheduler.TaskSchedulerImpl.$anonfun$stop$2(TaskSchedulerImpl.scala:992)
at org.apache.spark.util.Utils$.tryLogNonFatalError(Utils.scala:1375)
at org.apache.spark.scheduler.TaskSchedulerImpl.stop(TaskSchedulerImpl.scala:992)
at org.apache.spark.scheduler.DAGScheduler.$anonfun$stop$4(DAGScheduler.scala:2976)
at org.apache.spark.util.Utils$.tryLogNonFatalError(Utils.scala:1375)
at org.apache.spark.scheduler.DAGScheduler.stop(DAGScheduler.scala:2976)
at org.apache.spark.SparkContext.$anonfun$stop$12(SparkContext.scala:2263)
at org.apache.spark.util.Utils$.tryLogNonFatalError(Utils.scala:1375)
at org.apache.spark.SparkContext.stop(SparkContext.scala:2263)
at org.apache.spark.SparkContext.stop(SparkContext.scala:2216)
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend$MonitorThread.run(YarnClientSchedulerBackend.scala:125)
Caused by: java.io.IOException: Failed to send RPC RPC 4871454967371228059 to /1X2.XX.2X5.XX7:58360: io.netty.channel.StacklessClosedChannelException
at org.apache.spark.network.client.TransportClient$RpcChannelListener.handleFailure(TransportClient.java:395)
at org.apache.spark.network.client.TransportClient$StdChannelListener.operationComplete(TransportClient.java:372)
at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:590)
at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:557)
at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:492)
at io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:636)
at io.netty.util.concurrent.DefaultPromise.setFailure0(DefaultPromise.java:629)
at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:118)
at io.netty.channel.AbstractChannel$AbstractUnsafe.safeSetFailure(AbstractChannel.java:999)
at io.netty.channel.AbstractChannel$AbstractUnsafe.write(AbstractChannel.java:860)
at io.netty.channel.DefaultChannelPipeline$HeadContext.write(DefaultChannelPipeline.java:1367)
at io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:877)
at io.netty.channel.AbstractChannelHandlerContext.invokeWriteAndFlush(AbstractChannelHandlerContext.java:940)
at java.lang.Thread.run(Thread.java:750)
Caused by: io.netty.channel.StacklessClosedChannelException
at io.netty.channel.AbstractChannel$AbstractUnsafe.write(Object, ChannelPromise)(Unknown Source)
25/02/07 14:55:33 INFO CheckpointFileManager: Renamed temp file hdfs://ciXXXXf01:9000/user/cluster_pred/c458cbd3-8c00-4f10-8cce-c33377b74582/offsets/.2036.5b6dfa8e-882c-4ff1-9ae3-1c114319c719.tmp to hdfs://cichf1:9000/user/cluster_pred/c458cbd3-8c00-4f10-8cce-c33377b74582/offsets/2036
Created on 02-10-2025 08:32 PM - edited 02-11-2025 01:06 AM
Hi everyone, could anyone in the community help identify the root cause in this Spark streaming job?
@thoufeeq1218 wrote: ERROR YarnClientSchedulerBackend: YARN application has exited unexpectedly with state SUCCEEDED! Check the YARN application logs for more details.
