Created 06-22-2017 10:41 PM
When I submit a Spark job to the cluster, it fails and gives me the following error in the log file:
Caused by: java.io.IOException: Failed to connect to /0.0.0.0:35994
    at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:232)
    at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:182)
    at org.apache.spark.rpc.netty.NettyRpcEnv.createClient(NettyRpcEnv.scala:197)
    at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:194)
    at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:190)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Which I guess means it failed to connect to the driver. I tried increasing the "spark.yarn.executor.memoryOverhead" parameter, but it didn't help.
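To be concrete, by increasing the overhead I mean adding a flag like the one below (the value 2048 is just one example of what I tried):

--conf spark.yarn.executor.memoryOverhead=2048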
This is the submit command I use:
/bin/spark-submit --class example.Hello --jars ... --master yarn --deploy-mode cluster --supervise --conf spark.yarn.driver.memoryOverhead=1024 ...(jar file path)
I am using HDP-2.6.1.0 and Spark 2.1.1.
Created 06-23-2017 04:02 AM
Hi Tariq, can you confirm that the cluster is running? Also, what happens when you attempt to run in local mode (by changing the --master and --deploy-mode parameters)?
Created 06-23-2017 03:21 PM
Thank you for responding, Mark. Ambari does not show any alerts regarding Spark. Is there any other way to make sure the cluster is running? Also, I ran it in local mode like this:
/bin/spark-submit --class example.Hello --jars ... --master local --supervise --conf spark.yarn.driver.memoryOverhead=1024
and it ran without any problems.
Created 06-23-2017 04:08 PM
In the Ambari dashboard, Spark or Spark2 should appear in the list of installed services on the left. Click those links to see the status of the servers. If there are no links to Spark or Spark2, it may not be installed. Click the "Add Services" link at the bottom left to see whether Spark and/or Spark2 is selectable for install.
Created on 06-23-2017 10:56 PM - edited 08-17-2019 06:44 PM
Yes, I checked that in Ambari and Spark is installed. Here is a screenshot:
Created 06-23-2017 07:26 PM
Hi Tariq, please check whether any hardware or software firewall (iptables) sits between the client node and the worker node. You can test the connectivity as follows.
On the server end (where the driver runs):
nc -l 35994
On the client end (where the worker runs):
nc -vz <server-ip> 35994
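If the connection is allowed, nc -vz should report success; otherwise it will hang or be refused. You can also list the active firewall rules on either host to look for anything blocking the port (this assumes iptables is in use, and only catches rules that name the port explicitly):

sudo iptables -L -n | grep 35994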
Created 06-23-2017 11:04 PM
I am not able to do that since the driver port changes randomly on each job submit. Can I fix the port value?
Also, since the client is connecting to the driver on a local IP, I don't think a firewall is the problem, right?
Created 06-24-2017 03:26 AM
Set SPARK_MASTER_PORT=35994 in spark-env.sh and restart Spark. If you cannot pass the port test with a sample port, then it is a firewall issue. What is the output of the test?
On the server end (where the driver runs): nc -l 35994
On the client end (where the worker runs): nc -vz <server-ip> 35994
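For reference, the change would look like this (the path below is where HDP usually keeps the config, but adjust for your install):

# in /etc/spark2/conf/spark-env.sh (path is an assumption for HDP)
export SPARK_MASTER_PORT=35994

Then restart the Spark service from Ambari.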
Created 06-25-2017 06:26 PM
Thanks for your reply, Kalai.
In my submit command I am using Spark in YARN mode (--master yarn), not standalone mode, so I do not think it will use this configuration.
Also, as far as I understand, that variable sets the port of the master node in standalone mode and has nothing to do with the driver port.
Anyway, to confirm, I tried the changes you mentioned and the job still ran on random ports.
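If I understand the documentation correctly, the property that controls the driver's listening port is spark.driver.port, so pinning it for the firewall test would look something like this (untested on my side; 35994 is just the port from my log):

/bin/spark-submit --class example.Hello --jars ... --master yarn --deploy-mode cluster --conf spark.driver.port=35994 ...(jar file path)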
Created 06-30-2017 11:05 AM
Can you check if the firewall is blocking the ports?