We have multiple Spark Streaming jobs (~200) running on the cluster. We run them in "yarn-client" mode because we want to control stopping and starting them ourselves.
When we run these jobs in parallel, Spark needs a separate UI port for each driver, so about 200 ports in total.
How do we manage the ports?
Have you tried setting spark.ui.port while submitting the application in yarn client mode?
Please let me know if that works for you.
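To make the suggestion concrete, here is a minimal sketch of deriving a distinct spark.ui.port for each job from its index and building the spark-submit command from it. The base port 14040, the job names, and the jar path are hypothetical placeholders, and `--master yarn --deploy-mode client` assumes a Spark version where that form replaces `--master yarn-client`.

```python
# Sketch: give each job its own UI port instead of tracking ports by hand.
# BASE_UI_PORT, the job names, and the jar path are illustrative assumptions.

BASE_UI_PORT = 14040  # hypothetical start of a port block reserved for Spark UIs


def ui_port_for(job_index, base_port=BASE_UI_PORT):
    """Return a deterministic UI port for the job at the given index."""
    port = base_port + job_index
    if not 1024 <= port <= 65535:
        raise ValueError(f"port {port} is outside the usable range")
    return port


def submit_command(job_name, job_index, app_jar):
    """Build the spark-submit argument list for one job in yarn client mode."""
    return [
        "spark-submit",
        "--master", "yarn",
        "--deploy-mode", "client",
        "--name", job_name,
        "--conf", f"spark.ui.port={ui_port_for(job_index)}",
        app_jar,
    ]


if __name__ == "__main__":
    # Print the commands for the first three of the ~200 jobs.
    for i in range(3):
        print(" ".join(submit_command(f"stream-job-{i}", i, "app.jar")))
```

The resulting list can be passed to `subprocess.Popen`, which also gives the launcher a handle for stopping each job later.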
@Dhaval Modi Yes, the suggestion above means managing the ports manually. AFAIK, by default Spark tries to bind the UI to port 4040; if that port is in use it tries 4041, and so on, until it finds an open port or reaches the limit set by spark.port.maxRetries. I'm not aware of any other out-of-the-box strategy.
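As a rough illustration of that retry behaviour: spark.port.maxRetries defaults to 16, so a driver starting at 4040 would probe 4040 through 4056 before giving up. In yarn-client mode all drivers run on the submitting host, so ~200 of them would exhaust that window. The sketch below assumes the simplest workaround, raising spark.port.maxRetries to at least the number of jobs; the helper names are hypothetical, and note that this setting applies to other Spark ports as well, not only the UI.

```python
# Sketch of the default port-retry window and a conf that widens it.
# Helper names are illustrative; the two conf keys are standard Spark settings.

DEFAULT_UI_PORT = 4040
DEFAULT_MAX_RETRIES = 16  # Spark's documented default for spark.port.maxRetries


def ui_port_window(start_port=DEFAULT_UI_PORT, max_retries=DEFAULT_MAX_RETRIES):
    """Ports a driver may try: start_port through start_port + max_retries."""
    return range(start_port, start_port + max_retries + 1)


def retry_conf(num_jobs, start_port=DEFAULT_UI_PORT):
    """spark-submit flags that let num_jobs drivers on one host each find a UI port."""
    retries = max(num_jobs, DEFAULT_MAX_RETRIES)
    return [
        "--conf", f"spark.ui.port={start_port}",
        "--conf", f"spark.port.maxRetries={retries}",
    ]


if __name__ == "__main__":
    window = ui_port_window()
    print(f"default window: {window[0]}..{window[-1]} ({len(window)} ports)")
    print(" ".join(retry_conf(200)))
```

If the web UI is not needed for these jobs, passing `--conf spark.ui.enabled=false` avoids binding a UI port at all.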
Thanks for your reply. With ~200 jobs, it will be difficult to manage the ports manually, so I am looking for a strategy that does not require it.