Failing to connect to Spark driver when submitting job to Spark cluster

Explorer

When I submit a Spark job to the cluster, it fails and gives me the following error in the log file:

Caused by: java.io.IOException: Failed to connect to /0.0.0.0:35994
    at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:232)
    at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:182)
    at org.apache.spark.rpc.netty.NettyRpcEnv.createClient(NettyRpcEnv.scala:197)
    at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:194)
    at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:190)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

I guess this means it failed to connect to the driver. I tried increasing the "spark.yarn.executor.memoryOverhead" parameter, but it didn't help.
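For reference, this is a minimal sketch of how I pass that setting at submit time (the 2048 value here is only illustrative):

# raise the per-executor memory overhead (in MB); 2048 is an example value
/bin/spark-submit \
  --conf spark.yarn.executor.memoryOverhead=2048 \
  ...(rest of the submit command as below)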

This is the submit command I use:

/bin/spark-submit --class example.Hello --jars ... --master yarn --deploy-mode cluster --supervise --conf spark.yarn.driver.memoryOverhead=1024 ...(jar file path)

I am using HDP 2.6.1.0 and Spark 2.1.1.

10 REPLIES

Super Collaborator

Hi @tariq abughofa,

Could you please check whether SELinux is disabled on the driver host? It looks like SELinux may be blocking connections to the dynamically allocated ports, causing them to be refused.
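A minimal sketch of how to check and temporarily relax SELinux on the driver host (assuming a RHEL/CentOS node; setenforce only lasts until reboot, while the config-file change is permanent):

# show the current SELinux mode (Enforcing / Permissive / Disabled)
getenforce

# temporarily switch to permissive mode, then retry the spark-submit
sudo setenforce 0

# to make the change permanent, set SELINUX=permissive (or disabled)
# in /etc/selinux/config and reboot the node
sudo vi /etc/selinux/config

If the job connects successfully in permissive mode, SELinux policy was the culprit.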