Created on 10-19-2017 02:36 PM - edited 08-17-2019 05:42 PM
A lot of jobs are failing with an "unable to connect to driver" error.
Thanks, Thangarajan, but giving an IP and port will make the job specific to that particular IP and port.
So again, if those ports are not available, it could be a problem. Is there any other workaround?
Thanks for your reply.
Is this happening with all the jobs? For example, even with some long-running jobs?
Sometimes it can happen when the Spark job finishes fine but too early, while the executors are still trying to contact the driver. YARN then ultimately declares the job as failed, since the executors could not connect.
It can also happen if there is a firewall (network issue) and some ports are blocked, so you might want to check whether the ports are accessible. Ports are mostly chosen at random in Spark, but you may try setting spark.driver.port to see whether that port is accessible remotely and whether it helps. For other ports, please refer to:
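On the point about checking whether a port is accessible: a minimal sketch of such a check, run from a worker node against the driver host, could look like the following. It uses only the Python standard library. The function name port_is_reachable and the timeout value are my own choices for illustration, not anything Spark provides, and a successful TCP connect only shows the port is open through the firewall, not that the Spark driver itself is healthy.

```python
import socket


def port_is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds.

    Hypothetical helper for illustration; it is not part of Spark or YARN.
    """
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

You would call it with the driver's hostname and whatever value you set for spark.driver.port, while the job is running. If it returns False from the executor nodes but True on the driver host itself, a firewall between the nodes is the likely cause.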
Yes, it is happening with all the jobs, and after some time they pick some other port and start working fine.
What are those jobs trying to find on the port? Are they trying to find the NodeManager?
For the successful jobs, I went to the port and ran ps -wwf port number, where I could find the NodeManager running.