Spark error after upgrade to CDH 5.5.0

Hi, we have just upgraded our cluster to CDH 5.5.0. Since the upgrade, previously working Spark applications no longer run cleanly; even the simplest word count raises errors. There are no errors or warnings in Cloudera Manager. The errors in the logs are as follows:


15/11/24 15:04:12 ERROR ErrorMonitor: AssociationError [akka.tcp://sparkDriver@10.0.0.200:50785] <- [akka.tcp://driverPropsFetcher@shgc03:55655]: Error [Shut down address: akka.tcp://driverPropsFetcher@shgc03:55655] [
akka.remote.ShutDownAssociation: Shut down address: akka.tcp://driverPropsFetcher@shgc03:55655
Caused by: akka.remote.transport.Transport$InvalidAssociationException: The remote system terminated the association because it is shutting down.
]
akka.event.Logging$Error$NoCause$
15/11/24 15:04:13 INFO YarnClientSchedulerBackend: Registered executor: AkkaRpcEndpointRef(Actor[akka.tcp://sparkExecutor@shgc03:58430/user/Executor#-809057478]) with ID 1
15/11/24 15:04:13 INFO ExecutorAllocationManager: New executor 1 has registered (new total is 1)
15/11/24 15:04:13 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, shgc03, partition 0,NODE_LOCAL, 2220 bytes)
15/11/24 15:04:13 INFO BlockManagerMasterEndpoint: Registering block manager shgc03:43239 with 530.3 MB RAM, BlockManagerId(1, shgc03, 43239)
15/11/24 15:04:13 ERROR ErrorMonitor: AssociationError [akka.tcp://sparkDriver@10.0.0.200:50785] <- [akka.tcp://driverPropsFetcher@shgc02:55596]: Error [Shut down address: akka.tcp://driverPropsFetcher@shgc02:55596] [
akka.remote.ShutDownAssociation: Shut down address: akka.tcp://driverPropsFetcher@shgc02:55596
Caused by: akka.remote.transport.Transport$InvalidAssociationException: The remote system terminated the association because it is shutting down.
]
akka.event.Logging$Error$NoCause$
15/11/24 15:04:13 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on shgc03:43239 (size: 1859.0 B, free: 530.3 MB)
15/11/24 15:04:13 INFO YarnClientSchedulerBackend: Registered executor: AkkaRpcEndpointRef(Actor[akka.tcp://sparkExecutor@shgc02:34693/user/Executor#-143872230]) with ID 2
15/11/24 15:04:13 INFO ExecutorAllocationManager: New executor 2 has registered (new total is 2)
15/11/24 15:04:13 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, shgc02, partition 1,NODE_LOCAL, 2220 bytes)
15/11/24 15:04:13 INFO BlockManagerMasterEndpoint: Registering block manager shgc02:36398 with 530.3 MB RAM, BlockManagerId(2, shgc02, 36398)
15/11/24 15:04:14 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on shgc03:43239 (size: 22.0 KB, free: 530.3 MB)
15/11/24 15:04:14 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on shgc02:36398 (size: 1859.0 B, free: 530.3 MB)
15/11/24 15:04:14 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on shgc02:36398 (size: 22.0 KB, free: 530.3 MB)
15/11/24 15:04:14 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 1674 ms on shgc03 (1/2)
15/11/24 15:04:15 INFO TaskSetManager: Finished task 1.0 in stage 0.0 (TID 1) in 1534 ms on shgc02 (2/2)
15/11/24 15:04:15 INFO DAGScheduler: ResultStage 0 (count at App.scala:15) finished in 6.555 s
15/11/24 15:04:15 INFO YarnScheduler: Removed TaskSet 0.0, whose tasks have all completed, from pool
15/11/24 15:04:15 INFO DAGScheduler: Job 0 finished: count at App.scala:15, took 6.725293 s
15/11/24 15:04:15 INFO SparkContext: Starting job: count at App.scala:16
15/11/24 15:04:15 INFO DAGScheduler: Got job 1 (count at App.scala:16) with 2 output partitions
15/11/24 15:04:15 INFO DAGScheduler: Final stage: ResultStage 1(count at App.scala:16)
15/11/24 15:04:15 INFO DAGScheduler: Parents of final stage: List()
15/11/24 15:04:15 INFO DAGScheduler: Missing parents: List()
15/11/24 15:04:15 INFO DAGScheduler: Submitting ResultStage 1 (MapPartitionsRDD[3] at filter at App.scala:16), which has no missing parents
15/11/24 15:04:15 INFO MemoryStore: ensureFreeSpace(3184) called with curMem=222036, maxMem=556038881
15/11/24 15:04:15 INFO MemoryStore: Block broadcast_2 stored as values in memory (estimated size 3.1 KB, free 530.1 MB)
15/11/24 15:04:15 INFO MemoryStore: ensureFreeSpace(1861) called with curMem=225220, maxMem=556038881
15/11/24 15:04:15 INFO MemoryStore: Block broadcast_2_piece0 stored as bytes in memory (estimated size 1861.0 B, free 530.1 MB)
15/11/24 15:04:15 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on 10.0.0.200:58136 (size: 1861.0 B, free: 530.3 MB)
15/11/24 15:04:15 INFO SparkContext: Created broadcast 2 from broadcast at DAGScheduler.scala:861
15/11/24 15:04:15 INFO DAGScheduler: Submitting 2 missing tasks from ResultStage 1 (MapPartitionsRDD[3] at filter at App.scala:16)
15/11/24 15:04:15 INFO YarnScheduler: Adding task set 1.0 with 2 tasks
15/11/24 15:04:15 INFO TaskSetManager: Starting task 0.0 in stage 1.0 (TID 2, shgc02, partition 0,NODE_LOCAL, 2220 bytes)
15/11/24 15:04:15 INFO TaskSetManager: Starting task 1.0 in stage 1.0 (TID 3, shgc03, partition 1,NODE_LOCAL, 2220 bytes)
15/11/24 15:04:15 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on shgc02:36398 (size: 1861.0 B, free: 530.3 MB)
15/11/24 15:04:15 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on shgc03:43239 (size: 1861.0 B, free: 530.3 MB)
15/11/24 15:04:15 INFO TaskSetManager: Finished task 0.0 in stage 1.0 (TID 2) in 73 ms on shgc02 (1/2)
15/11/24 15:04:15 INFO TaskSetManager: Finished task 1.0 in stage 1.0 (TID 3) in 120 ms on shgc03 (2/2)
15/11/24 15:04:15 INFO DAGScheduler: ResultStage 1 (count at App.scala:16) finished in 0.122 s
15/11/24 15:04:15 INFO YarnScheduler: Removed TaskSet 1.0, whose tasks have all completed, from pool
15/11/24 15:04:15 INFO DAGScheduler: Job 1 finished: count at App.scala:16, took 0.147709 s
Lines with a: 4, Lines with b: 3
15/11/24 15:04:15 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/metrics/json,null}
15/11/24 15:04:15 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage/kill,null}
15/11/24 15:04:15 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/api,null}
15/11/24 15:04:15 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/,null}
15/11/24 15:04:15 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/static,null}
15/11/24 15:04:15 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/threadDump/json,null}
15/11/24 15:04:15 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/threadDump,null}
15/11/24 15:04:15 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/json,null}
15/11/24 15:04:15 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors,null}
15/11/24 15:04:15 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/environment/json,null}
15/11/24 15:04:15 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/environment,null}
15/11/24 15:04:15 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/rdd/json,null}
15/11/24 15:04:15 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/rdd,null}
15/11/24 15:04:15 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/json,null}
15/11/24 15:04:15 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage,null}
15/11/24 15:04:15 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/pool/json,null}
15/11/24 15:04:15 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/pool,null}
15/11/24 15:04:15 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage/json,null}
15/11/24 15:04:15 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage,null}
15/11/24 15:04:15 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/json,null}
15/11/24 15:04:15 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages,null}
15/11/24 15:04:15 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/job/json,null}
15/11/24 15:04:15 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/job,null}
15/11/24 15:04:15 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/json,null}
15/11/24 15:04:15 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs,null}
15/11/24 15:04:15 INFO SparkUI: Stopped Spark web UI at http://10.0.0.200:4041
15/11/24 15:04:15 INFO DAGScheduler: Stopping DAGScheduler
15/11/24 15:04:15 INFO YarnClientSchedulerBackend: Shutting down all executors
15/11/24 15:04:15 INFO YarnClientSchedulerBackend: Interrupting monitor thread
15/11/24 15:04:15 INFO YarnClientSchedulerBackend: Asking each executor to shut down
15/11/24 15:04:15 INFO YarnClientSchedulerBackend: Stopped
15/11/24 15:04:15 ERROR ErrorMonitor: AssociationError [akka.tcp://sparkDriver@10.0.0.200:50785] <- [akka.tcp://sparkExecutor@shgc03:58430]: Error [Shut down address: akka.tcp://sparkExecutor@shgc03:58430] [
akka.remote.ShutDownAssociation: Shut down address: akka.tcp://sparkExecutor@shgc03:58430
Caused by: akka.remote.transport.Transport$InvalidAssociationException: The remote system terminated the association because it is shutting down.
]
akka.event.Logging$Error$NoCause$
15/11/24 15:04:15 ERROR ErrorMonitor: AssociationError [akka.tcp://sparkDriver@10.0.0.200:50785] <- [akka.tcp://sparkExecutor@shgc02:34693]: Error [Shut down address: akka.tcp://sparkExecutor@shgc02:34693] [
akka.remote.ShutDownAssociation: Shut down address: akka.tcp://sparkExecutor@shgc02:34693
Caused by: akka.remote.transport.Transport$InvalidAssociationException: The remote system terminated the association because it is shutting down.
]
akka.event.Logging$Error$NoCause$


Could anyone kindly give a hand?
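For context, the application is essentially the standard Spark quick-start example; a minimal sketch of its shape is below. The object name and input path here are assumptions, but the two counts correspond to the `count at App.scala:15` and `count at App.scala:16` lines in the log, and the job does print its result ("Lines with a: 4, Lines with b: 3") before the AssociationError messages appear at shutdown.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object App {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("Simple Application")
    val sc = new SparkContext(conf)

    // Input path is an assumption for illustration
    val logData = sc.textFile("hdfs:///user/test/README.md").cache()

    // These two counts match "count at App.scala:15/16" in the log above
    val numAs = logData.filter(line => line.contains("a")).count()
    val numBs = logData.filter(line => line.contains("b")).count()

    println(s"Lines with a: $numAs, Lines with b: $numBs")
    sc.stop()
  }
}
```

This sketch needs a Spark/YARN cluster to run (e.g. via `spark-submit --master yarn-client`), so it is shown for reference only.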
