The spark workers does not get connected with Spark Master


Explorer

Hi,

We're facing an issue with Spark in our production environment: the Spark workers do not connect to the Spark Master. Please see the logs below and help us resolve this issue.

Master Log:

18/06/25 22:59:12 INFO master.Master: akka.tcp://sparkWorker@spark7:7084 got disassociated, removing it.
18/06/25 22:59:12 INFO master.Master: akka.tcp://sparkWorker@spark7:7084 got disassociated, removing it.
18/06/25 22:59:12 WARN remote.ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkWorker@spark7:7084] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
18/06/25 22:59:28 INFO master.Master: akka.tcp://sparkWorker@spark7:7079 got disassociated, removing it.
18/06/25 22:59:28 INFO master.Master: akka.tcp://sparkWorker@spark7:7079 got disassociated, removing it.
18/06/25 22:59:28 WARN remote.ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkWorker@spark7:7079] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
18/06/25 22:59:28 INFO master.Master: akka.tcp://sparkWorker@spark7:7082 got disassociated, removing it.
18/06/25 22:59:28 INFO master.Master: akka.tcp://sparkWorker@spark7:7082 got disassociated, removing it.
18/06/25 22:59:28 WARN remote.ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkWorker@spark7:7082] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
18/06/25 22:59:35 INFO master.Master: akka.tcp://sparkWorker@spark8:7081 got disassociated, removing it.
18/06/25 22:59:35 WARN remote.ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkWorker@spark8:7081] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
18/06/25 22:59:35 INFO master.Master: akka.tcp://sparkWorker@spark8:7081 got disassociated, removing it.
18/06/25 23:00:23 INFO master.Master: akka.tcp://sparkWorker@spark9:7081 got disassociated, removing it.
18/06/25 23:00:23 INFO master.Master: akka.tcp://sparkWorker@spark9:7081 got disassociated, removing it.
18/06/25 23:00:23 WARN remote.ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkWorker@spark9:7081] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
18/06/25 23:00:45 INFO master.Master: akka.tcp://sparkWorker@spark8:7085 got disassociated, removing it.
18/06/25 23:00:45 INFO master.Master: akka.tcp://sparkWorker@spark8:7085 got disassociated, removing it.
18/06/25 23:00:45 WARN remote.ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkWorker@spark8:7085] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
18/06/25 23:00:48 INFO master.Master: akka.tcp://sparkWorker@spark7:7083 got disassociated, removing it.
18/06/25 23:00:48 INFO master.Master: akka.tcp://sparkWorker@spark7:7083 got disassociated, removing it.
18/06/25 23:00:48 WARN remote.ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkWorker@spark7:7083] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
18/06/25 23:01:52 INFO master.Master: akka.tcp://sparkWorker@spark0:7080 got disassociated, removing it.
18/06/25 23:01:52 INFO master.Master: akka.tcp://sparkWorker@spark0:7080 got disassociated, removing it.
18/06/25 23:01:52 WARN remote.ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkWorker@spark0:7080] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].


Worker Log:


18/06/25 22:43:56 INFO util.Utils: Successfully started service 'sparkWorker' on port 7081.
18/06/25 22:43:56 INFO worker.Worker: Starting Spark worker HKLPADBID09:7081 with 4 cores, 16.0 GB RAM
18/06/25 22:43:56 INFO worker.Worker: Running Spark version 1.4.1-palantir3
18/06/25 22:43:56 INFO worker.Worker: Spark home: /opt/palantir/spark-1.4.1-palantir3-bin-hadoop2.4
18/06/25 22:43:56 INFO server.Server: jetty-8.y.z-SNAPSHOT
18/06/25 22:43:56 INFO server.AbstractConnector: Started SelectChannelConnector@0.0.0.0:8084
18/06/25 22:43:56 INFO util.Utils: Successfully started service 'WorkerUI' on port 8084.
18/06/25 22:43:56 INFO ui.WorkerWebUI: Started WorkerWebUI at http://SPARK:8084
18/06/25 22:43:56 INFO worker.Worker: Connecting to master akka.tcp://sparkMaster@spark:7077/user/Master...
18/06/25 22:44:10 INFO worker.Worker: Retrying connection to master (attempt # 1)
18/06/25 22:44:10 INFO worker.Worker: Connecting to master akka.tcp://sparkMaster@spark:7077/user/Master...
18/06/25 22:44:24 INFO worker.Worker: Retrying connection to master (attempt # 2)
18/06/25 22:44:24 INFO worker.Worker: Connecting to master akka.tcp://sparkMaster@spark:7077/user/Master...
18/06/25 22:44:38 INFO worker.Worker: Retrying connection to master (attempt # 3)
18/06/25 22:44:38 INFO worker.Worker: Connecting to master akka.tcp://sparkMaster@spark:7077/user/Master...
18/06/25 22:44:52 INFO worker.Worker: Retrying connection to master (attempt # 4)
18/06/25 22:44:52 INFO worker.Worker: Connecting to master akka.tcp://sparkMaster@spark:7077/user/Master...
18/06/25 22:45:06 INFO worker.Worker: Retrying connection to master (attempt # 5)
18/06/25 22:45:06 INFO worker.Worker: Connecting to master akka.tcp://sparkMaster@spark:7077/user/Master...
18/06/25 22:45:20 INFO worker.Worker: Retrying connection to master (attempt # 6)
18/06/25 22:45:20 INFO worker.Worker: Connecting to master akka.tcp://sparkMaster@spark:7077/user/Master...
18/06/25 22:46:42 INFO worker.Worker: Retrying connection to master (attempt # 7)
18/06/25 22:46:42 INFO worker.Worker: Connecting to master akka.tcp://sparkMaster@spark:7077/user/Master...
18/06/25 22:48:04 INFO worker.Worker: Retrying connection to master (attempt # 8)
18/06/25 22:48:04 INFO worker.Worker: Connecting to master akka.tcp://sparkMaster@spark:7077/user/Master...
18/06/25 22:49:26 INFO worker.Worker: Retrying connection to master (attempt # 9)
18/06/25 22:49:26 INFO worker.Worker: Connecting to master akka.tcp://sparkMaster@spark:7077/user/Master...
18/06/25 22:50:48 INFO worker.Worker: Retrying connection to master (attempt # 10)
18/06/25 22:50:48 INFO worker.Worker: Connecting to master akka.tcp://sparkMaster@spark:7077/user/Master...
18/06/25 22:52:10 INFO worker.Worker: Retrying connection to master (attempt # 11)
18/06/25 22:52:10 INFO worker.Worker: Connecting to master akka.tcp://sparkMaster@spark:7077/user/Master...
18/06/25 22:53:32 INFO worker.Worker: Retrying connection to master (attempt # 12)
18/06/25 22:53:32 INFO worker.Worker: Connecting to master akka.tcp://sparkMaster@spark:7077/user/Master...
18/06/25 22:54:54 INFO worker.Worker: Retrying connection to master (attempt # 13)
18/06/25 22:54:54 INFO worker.Worker: Connecting to master akka.tcp://sparkMaster@spark:7077/user/Master...
18/06/25 22:56:16 INFO worker.Worker: Retrying connection to master (attempt # 14)
18/06/25 22:56:16 INFO worker.Worker: Connecting to master akka.tcp://sparkMaster@spark:7077/user/Master...
18/06/25 22:57:38 INFO worker.Worker: Retrying connection to master (attempt # 15)
18/06/25 22:57:38 INFO worker.Worker: Connecting to master akka.tcp://sparkMaster@spark:7077/user/Master...
18/06/25 22:59:00 INFO worker.Worker: Retrying connection to master (attempt # 16)
18/06/25 22:59:00 INFO worker.Worker: Connecting to master akka.tcp://sparkMaster@spark:7077/user/Master...
18/06/25 23:00:22 ERROR worker.Worker: All masters are unresponsive! Giving up.
18/06/25 23:00:22 INFO util.Utils: Shutdown hook called
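
The worker log above shows repeated attempts to reach spark:7077 with no success. One basic thing that could be ruled out first is plain TCP reachability from a worker host to the master. The following is only a minimal sketch using the Python standard library; the helper name can_connect is made up for illustration, and a successful TCP connection would not by itself prove that the master accepts the worker's registration.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper: it only tests that something is listening and
    reachable, not that the listener is a healthy Spark master.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

Run on a worker host, can_connect("spark", 7077) uses the host name and port taken from the worker log. A False result would point toward DNS, firewall, or bind-address problems rather than Spark itself.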

3 REPLIES

Re: The spark workers does not get connected with Spark Master

Explorer

@Geoffrey Shelton Okot @adash Could you please help us fix the issue described above? Thanks.

Re: The spark workers does not get connected with Spark Master

Mentor

@Saravana V

Can you check that the ports in the Java code and the Akka configuration match?

Re: The spark workers does not get connected with Spark Master

Explorer

@Geoffrey Shelton Okot The port in the config is correct. How do I check the port in the Java code?
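
One possible way to approach the port check, sketched here under assumptions: pull the master address that the workers actually try to reach out of the worker log, then compare it with the host and port the master is configured with. In a standalone deployment those are commonly set through SPARK_MASTER_IP and SPARK_MASTER_PORT in conf/spark-env.sh, and the application side passes a spark://host:port URL, for example through SparkConf.setMaster. Whether this cluster is configured that way is an assumption. The function names below are hypothetical. The comparison is an exact string match on the host name, because Akka-based Spark releases are often reported to be strict about the address matching what the master bound to.

```python
import re

# Matches the Akka-style addresses seen in the logs in this thread, e.g.
# akka.tcp://sparkMaster@spark:7077/user/Master
AKKA_ADDR = re.compile(
    r"akka\.tcp://(?P<system>\w+)@(?P<host>[\w.-]+):(?P<port>\d+)"
)


def master_endpoints(log_text: str) -> set:
    """Collect the distinct (host, port) pairs of sparkMaster addresses in a log."""
    return {
        (m.group("host"), int(m.group("port")))
        for m in AKKA_ADDR.finditer(log_text)
        if m.group("system") == "sparkMaster"
    }


def matches_config(log_text: str, expected_host: str, expected_port: int) -> bool:
    """True if the log mentions a master and every mention equals the expected pair."""
    found = master_endpoints(log_text)
    return bool(found) and found == {(expected_host, expected_port)}
```

For the worker log in the question, master_endpoints would yield the single pair ("spark", 7077). If the master is bound to a different name, such as a fully qualified host name or an IP address, matches_config would return False even though the port is the same.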
