
Delay connecting to ResourceManager when running Spark2 jobs

Hi folks,

This is a new deployment, and when running some Spark2 test jobs there is sometimes a delay of about 30 seconds when connecting to the ResourceManager.
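For reference, the test job is just the stock SparkPi example that ships with HDP (you can see the examples jar in the logs below). A minimal PySpark equivalent, which goes through the same YARN client startup and therefore the same "Connecting to ResourceManager" step, would look roughly like this; it's only a sketch of the kind of job involved, not the exact command being submitted:

# Minimal PySpark sketch comparable to the SparkPi example in the logs below.
# Assumes it is submitted with --master yarn so the driver talks to the RM.
from __future__ import print_function
import random
from operator import add
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("Spark Pi").getOrCreate()  # YARN client connects to the RM here
n = 100000 * 2

def inside(_):
    # Sample a point in the unit square and test whether it falls inside the circle
    x = random.random() * 2 - 1
    y = random.random() * 2 - 1
    return 1 if x ** 2 + y ** 2 <= 1 else 0

count = spark.sparkContext.parallelize(range(1, n + 1), 2).map(inside).reduce(add)
print("Pi is roughly %f" % (4.0 * count / n))
spark.stop()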

Here is the output when running fine:

18/12/13 16:22:27 INFO SparkContext: Running Spark version 2.2.0.2.6.3.0-235
18/12/13 16:22:28 INFO SparkContext: Submitted application: Spark Pi
18/12/13 16:22:28 INFO SecurityManager: Changing view acls to: testuser1
18/12/13 16:22:28 INFO SecurityManager: Changing modify acls to: testuser1
18/12/13 16:22:28 INFO SecurityManager: Changing view acls groups to:
18/12/13 16:22:28 INFO SecurityManager: Changing modify acls groups to:
18/12/13 16:22:28 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(testuser1); groups with view permissions: Set(); users  with modify permissions: Set(testuser1); groups with modify permissions: Set()
18/12/13 16:22:29 INFO Utils: Successfully started service 'sparkDriver' on port 43301.
18/12/13 16:22:29 INFO SparkEnv: Registering MapOutputTracker
18/12/13 16:22:29 INFO SparkEnv: Registering BlockManagerMaster
18/12/13 16:22:29 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
18/12/13 16:22:29 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
18/12/13 16:22:29 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-1acde062-860f-48f4-b3cd-0140739bb0c8
18/12/13 16:22:29 INFO MemoryStore: MemoryStore started with capacity 93.3 MB
18/12/13 16:22:29 INFO SparkEnv: Registering OutputCommitCoordinator
18/12/13 16:22:29 INFO Utils: Successfully started service 'SparkUI' on port 4040.
18/12/13 16:22:29 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://192.168.123.111:4040
18/12/13 16:22:30 INFO SparkContext: Added JAR file:/usr/hdp/2.6.3.0-235/spark2/examples/jars/spark-examples_2.11-2.2.0.2.6.3.0-235.jar at spark://192.168.123.111:43301/jars/spark-examples_2.11-2.2.0.2.6.3.0-235.jar with timestamp 1544714550011
18/12/13 16:22:30 INFO RMProxy: Connecting to ResourceManager at node1473.domain.com/192.168.123.112:8050
18/12/13 16:22:31 INFO Client: Requesting a new application from cluster with 24 NodeManagers



And here is the output when it runs with the 30-second delay (see the timestamps on the last two lines):

18/12/13 16:18:19 INFO SparkContext: Running Spark version 2.2.0.2.6.3.0-235
18/12/13 16:18:20 INFO SparkContext: Submitted application: Spark Pi
18/12/13 16:18:20 INFO SecurityManager: Changing view acls to: testuser1
18/12/13 16:18:20 INFO SecurityManager: Changing modify acls to: testuser1
18/12/13 16:18:20 INFO SecurityManager: Changing view acls groups to:
18/12/13 16:18:20 INFO SecurityManager: Changing modify acls groups to:
18/12/13 16:18:20 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(testuser1); groups with view permissions: Set(); users  with modify permissions: Set(testuser1); groups with modify permissions: Set()
18/12/13 16:18:20 INFO Utils: Successfully started service 'sparkDriver' on port 37544.
18/12/13 16:18:20 INFO SparkEnv: Registering MapOutputTracker
18/12/13 16:18:20 INFO SparkEnv: Registering BlockManagerMaster
18/12/13 16:18:20 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
18/12/13 16:18:20 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
18/12/13 16:18:20 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-8206cdc9-fb85-4e99-95f8-a7d48164be10
18/12/13 16:18:20 INFO MemoryStore: MemoryStore started with capacity 93.3 MB
18/12/13 16:18:21 INFO SparkEnv: Registering OutputCommitCoordinator
18/12/13 16:18:21 INFO Utils: Successfully started service 'SparkUI' on port 4040.
18/12/13 16:18:21 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://192.168.123.111:4040
18/12/13 16:18:21 INFO SparkContext: Added JAR file:/usr/hdp/2.6.3.0-235/spark2/examples/jars/spark-examples_2.11-2.2.0.2.6.3.0-235.jar at spark://192.168.123.111:37544/jars/spark-examples_2.11-2.2.0.2.6.3.0-235.jar with timestamp 1544714301640
18/12/13 16:18:23 INFO RMProxy: Connecting to ResourceManager at node1473.domain.com/192.168.123.112:8050
18/12/13 16:18:53 INFO Client: Requesting a new application from cluster with 24 NodeManagers

The job is the same, and the cluster has no load at all; it's otherwise empty.
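In case it helps to narrow this down, one check I can run from the driver node is to time forward and reverse DNS lookups of the ResourceManager host, since an intermittent stall right at the "Connecting to ResourceManager" step could simply be slow name resolution. This is only a rough sketch of that check (hostname and IP are taken from the logs above, and the loop count is arbitrary):

# Rough check for intermittently slow forward/reverse DNS lookups of the
# ResourceManager host; hostname/IP are the ones from the logs above.
from __future__ import print_function
import socket
import time

RM_HOST = "node1473.domain.com"
RM_IP = "192.168.123.112"

for run in range(20):
    start = time.time()
    try:
        socket.getaddrinfo(RM_HOST, 8050)      # forward lookup, as the YARN client does
        forward = "%.2fs" % (time.time() - start)
    except socket.gaierror as err:
        forward = "failed (%s)" % err

    start = time.time()
    try:
        socket.gethostbyaddr(RM_IP)            # reverse lookup
        reverse = "%.2fs" % (time.time() - start)
    except (socket.herror, socket.gaierror) as err:
        reverse = "failed (%s)" % err

    print("run %2d: forward=%s reverse=%s" % (run, forward, reverse))
    time.sleep(1)

If every run comes back quickly, I can probably rule DNS out and look elsewhere.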

Any suggestions?

Many thanks in advance,

Jorge.