Spark error after upgrade to CDH 5.5.0


Hi, we have just upgraded our cluster to CDH 5.5.0. Since the upgrade, our previously developed Spark applications no longer run cleanly. Even the simplest word count raises errors. Cloudera Manager shows no errors or warnings. The errors in the logs are as follows:

 

15/11/24 15:04:12 ERROR ErrorMonitor: AssociationError [akka.tcp://sparkDriver@10.0.0.200:50785] <- [akka.tcp://driverPropsFetcher@shgc03:55655]: Error [Shut down address: akka.tcp://driverPropsFetcher@shgc03:55655] [
akka.remote.ShutDownAssociation: Shut down address: akka.tcp://driverPropsFetcher@shgc03:55655
Caused by: akka.remote.transport.Transport$InvalidAssociationException: The remote system terminated the association because it is shutting down.
]
akka.event.Logging$Error$NoCause$
15/11/24 15:04:13 INFO YarnClientSchedulerBackend: Registered executor: AkkaRpcEndpointRef(Actor[akka.tcp://sparkExecutor@shgc03:58430/user/Executor#-809057478]) with ID 1
15/11/24 15:04:13 INFO ExecutorAllocationManager: New executor 1 has registered (new total is 1)
15/11/24 15:04:13 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, shgc03, partition 0,NODE_LOCAL, 2220 bytes)
15/11/24 15:04:13 INFO BlockManagerMasterEndpoint: Registering block manager shgc03:43239 with 530.3 MB RAM, BlockManagerId(1, shgc03, 43239)
15/11/24 15:04:13 ERROR ErrorMonitor: AssociationError [akka.tcp://sparkDriver@10.0.0.200:50785] <- [akka.tcp://driverPropsFetcher@shgc02:55596]: Error [Shut down address: akka.tcp://driverPropsFetcher@shgc02:55596] [
akka.remote.ShutDownAssociation: Shut down address: akka.tcp://driverPropsFetcher@shgc02:55596
Caused by: akka.remote.transport.Transport$InvalidAssociationException: The remote system terminated the association because it is shutting down.
]
akka.event.Logging$Error$NoCause$
15/11/24 15:04:13 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on shgc03:43239 (size: 1859.0 B, free: 530.3 MB)
15/11/24 15:04:13 INFO YarnClientSchedulerBackend: Registered executor: AkkaRpcEndpointRef(Actor[akka.tcp://sparkExecutor@shgc02:34693/user/Executor#-143872230]) with ID 2
15/11/24 15:04:13 INFO ExecutorAllocationManager: New executor 2 has registered (new total is 2)
15/11/24 15:04:13 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, shgc02, partition 1,NODE_LOCAL, 2220 bytes)
15/11/24 15:04:13 INFO BlockManagerMasterEndpoint: Registering block manager shgc02:36398 with 530.3 MB RAM, BlockManagerId(2, shgc02, 36398)
15/11/24 15:04:14 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on shgc03:43239 (size: 22.0 KB, free: 530.3 MB)
15/11/24 15:04:14 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on shgc02:36398 (size: 1859.0 B, free: 530.3 MB)
15/11/24 15:04:14 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on shgc02:36398 (size: 22.0 KB, free: 530.3 MB)
15/11/24 15:04:14 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 1674 ms on shgc03 (1/2)
15/11/24 15:04:15 INFO TaskSetManager: Finished task 1.0 in stage 0.0 (TID 1) in 1534 ms on shgc02 (2/2)
15/11/24 15:04:15 INFO DAGScheduler: ResultStage 0 (count at App.scala:15) finished in 6.555 s
15/11/24 15:04:15 INFO YarnScheduler: Removed TaskSet 0.0, whose tasks have all completed, from pool
15/11/24 15:04:15 INFO DAGScheduler: Job 0 finished: count at App.scala:15, took 6.725293 s
15/11/24 15:04:15 INFO SparkContext: Starting job: count at App.scala:16
15/11/24 15:04:15 INFO DAGScheduler: Got job 1 (count at App.scala:16) with 2 output partitions
15/11/24 15:04:15 INFO DAGScheduler: Final stage: ResultStage 1(count at App.scala:16)
15/11/24 15:04:15 INFO DAGScheduler: Parents of final stage: List()
15/11/24 15:04:15 INFO DAGScheduler: Missing parents: List()
15/11/24 15:04:15 INFO DAGScheduler: Submitting ResultStage 1 (MapPartitionsRDD[3] at filter at App.scala:16), which has no missing parents
15/11/24 15:04:15 INFO MemoryStore: ensureFreeSpace(3184) called with curMem=222036, maxMem=556038881
15/11/24 15:04:15 INFO MemoryStore: Block broadcast_2 stored as values in memory (estimated size 3.1 KB, free 530.1 MB)
15/11/24 15:04:15 INFO MemoryStore: ensureFreeSpace(1861) called with curMem=225220, maxMem=556038881
15/11/24 15:04:15 INFO MemoryStore: Block broadcast_2_piece0 stored as bytes in memory (estimated size 1861.0 B, free 530.1 MB)
15/11/24 15:04:15 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on 10.0.0.200:58136 (size: 1861.0 B, free: 530.3 MB)
15/11/24 15:04:15 INFO SparkContext: Created broadcast 2 from broadcast at DAGScheduler.scala:861
15/11/24 15:04:15 INFO DAGScheduler: Submitting 2 missing tasks from ResultStage 1 (MapPartitionsRDD[3] at filter at App.scala:16)
15/11/24 15:04:15 INFO YarnScheduler: Adding task set 1.0 with 2 tasks
15/11/24 15:04:15 INFO TaskSetManager: Starting task 0.0 in stage 1.0 (TID 2, shgc02, partition 0,NODE_LOCAL, 2220 bytes)
15/11/24 15:04:15 INFO TaskSetManager: Starting task 1.0 in stage 1.0 (TID 3, shgc03, partition 1,NODE_LOCAL, 2220 bytes)
15/11/24 15:04:15 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on shgc02:36398 (size: 1861.0 B, free: 530.3 MB)
15/11/24 15:04:15 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on shgc03:43239 (size: 1861.0 B, free: 530.3 MB)
15/11/24 15:04:15 INFO TaskSetManager: Finished task 0.0 in stage 1.0 (TID 2) in 73 ms on shgc02 (1/2)
15/11/24 15:04:15 INFO TaskSetManager: Finished task 1.0 in stage 1.0 (TID 3) in 120 ms on shgc03 (2/2)
15/11/24 15:04:15 INFO DAGScheduler: ResultStage 1 (count at App.scala:16) finished in 0.122 s
15/11/24 15:04:15 INFO YarnScheduler: Removed TaskSet 1.0, whose tasks have all completed, from pool
15/11/24 15:04:15 INFO DAGScheduler: Job 1 finished: count at App.scala:16, took 0.147709 s
Lines with a: 4, Lines with b: 3
15/11/24 15:04:15 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/metrics/json,null}
15/11/24 15:04:15 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage/kill,null}
15/11/24 15:04:15 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/api,null}
15/11/24 15:04:15 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/,null}
15/11/24 15:04:15 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/static,null}
15/11/24 15:04:15 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/threadDump/json,null}
15/11/24 15:04:15 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/threadDump,null}
15/11/24 15:04:15 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/json,null}
15/11/24 15:04:15 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors,null}
15/11/24 15:04:15 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/environment/json,null}
15/11/24 15:04:15 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/environment,null}
15/11/24 15:04:15 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/rdd/json,null}
15/11/24 15:04:15 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/rdd,null}
15/11/24 15:04:15 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/json,null}
15/11/24 15:04:15 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage,null}
15/11/24 15:04:15 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/pool/json,null}
15/11/24 15:04:15 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/pool,null}
15/11/24 15:04:15 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage/json,null}
15/11/24 15:04:15 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage,null}
15/11/24 15:04:15 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/json,null}
15/11/24 15:04:15 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages,null}
15/11/24 15:04:15 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/job/json,null}
15/11/24 15:04:15 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/job,null}
15/11/24 15:04:15 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/json,null}
15/11/24 15:04:15 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs,null}
15/11/24 15:04:15 INFO SparkUI: Stopped Spark web UI at http://10.0.0.200:4041
15/11/24 15:04:15 INFO DAGScheduler: Stopping DAGScheduler
15/11/24 15:04:15 INFO YarnClientSchedulerBackend: Shutting down all executors
15/11/24 15:04:15 INFO YarnClientSchedulerBackend: Interrupting monitor thread
15/11/24 15:04:15 INFO YarnClientSchedulerBackend: Asking each executor to shut down
15/11/24 15:04:15 INFO YarnClientSchedulerBackend: Stopped
15/11/24 15:04:15 ERROR ErrorMonitor: AssociationError [akka.tcp://sparkDriver@10.0.0.200:50785] <- [akka.tcp://sparkExecutor@shgc03:58430]: Error [Shut down address: akka.tcp://sparkExecutor@shgc03:58430] [
akka.remote.ShutDownAssociation: Shut down address: akka.tcp://sparkExecutor@shgc03:58430
Caused by: akka.remote.transport.Transport$InvalidAssociationException: The remote system terminated the association because it is shutting down.
]
akka.event.Logging$Error$NoCause$
15/11/24 15:04:15 ERROR ErrorMonitor: AssociationError [akka.tcp://sparkDriver@10.0.0.200:50785] <- [akka.tcp://sparkExecutor@shgc02:34693]: Error [Shut down address: akka.tcp://sparkExecutor@shgc02:34693] [
akka.remote.ShutDownAssociation: Shut down address: akka.tcp://sparkExecutor@shgc02:34693
Caused by: akka.remote.transport.Transport$InvalidAssociationException: The remote system terminated the association because it is shutting down.
]
akka.event.Logging$Error$NoCause$
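
To make a long log like the one above easier to read, the ERROR entries can be separated from the INFO ones with a short script. This is only a rough sketch: the `summarize` function, the regular expression, and the assumed line layout (`yy/MM/dd HH:mm:ss LEVEL Component: message`) are my own and are not part of Spark or CDH. The sample text is copied from the log above.

```python
import re
from collections import Counter

# Assumed layout of each log entry in the excerpt:
# "yy/MM/dd HH:mm:ss LEVEL Component: message"
LINE_RE = re.compile(r"^(\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) (\w+) (\w+): (.*)$")


def summarize(log_text):
    """Count entries per level and collect the ERROR entries.

    Returns (levels, errors), where levels is a Counter keyed by level name
    and errors is a list of (timestamp, component, message) tuples.
    """
    levels = Counter()
    errors = []
    for line in log_text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            # Continuation lines (exception details, "]") and program
            # output do not start with a timestamp, so they are skipped.
            continue
        ts, level, component, message = m.groups()
        levels[level] += 1
        if level == "ERROR":
            errors.append((ts, component, message))
    return levels, errors


# A few lines copied from the log above.
sample = """15/11/24 15:04:12 ERROR ErrorMonitor: AssociationError [akka.tcp://sparkDriver@10.0.0.200:50785] <- [akka.tcp://driverPropsFetcher@shgc03:55655]: Error [Shut down address: akka.tcp://driverPropsFetcher@shgc03:55655] [
akka.remote.ShutDownAssociation: Shut down address: akka.tcp://driverPropsFetcher@shgc03:55655
15/11/24 15:04:15 INFO DAGScheduler: Job 1 finished: count at App.scala:16, took 0.147709 s
Lines with a: 4, Lines with b: 3
"""

levels, errors = summarize(sample)
print(dict(levels))
for ts, component, message in errors:
    print(ts, component, message[:60])
```

Run against the full driver output, this lists which components log at ERROR level and when, which makes it easier to see how those entries line up with executor start-up and shutdown.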

 

Could anyone kindly lend a hand?
