
Thousands of tasks in shufflemapstage


I'm trying to figure out why one of my regular Spark jobs (Spark 2.2, using RDDs) is taking so very long. I may have unrealistic expectations, so I wanted to check whether any of the log messages below suggest room for improvement / tuning.

We have a ten-node cluster with 9 worker nodes, each with 128 GB RAM and 64 CPU cores, so there is plenty of capacity. We currently submit the job with 8 executors and 3 cores per executor. I'm trying to understand where the bottleneck is before I make any adjustments.
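
For reference, this is roughly what our current sizing looks like expressed in code (a sketch only; in reality we pass these as spark-submit flags, and the app name is a placeholder):

import org.apache.spark.sql.SparkSession;

public class SubmitSizingSketch {
    public static void main(String[] args) {
        // Our current sizing, normally passed as spark-submit flags:
        //   --num-executors 8 --executor-cores 3
        // With 9 workers x 64 cores available, that is only 8 x 3 = 24 cores in use.
        SparkSession spark = SparkSession.builder()
                .appName("test_job")                      // placeholder name
                .config("spark.executor.instances", "8")  // 8 executors
                .config("spark.executor.cores", "3")      // 3 cores per executor
                .getOrCreate();
    }
}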

The majority of the time in the slow Spark job is spent between these two log entries:

19/07/03 04:00:39 INFO DAGScheduler: Got job 2 (saveAsTable at test_job.java:3000) with 200 output partitions
...
19/07/03 04:14:06 INFO DAGScheduler: Job 2 finished: saveAsTable at test_job.java:3000, took 806.442023 s


In the same log, when I drill into what job 2 is up to, I can see:

19/07/03 04:04:53 INFO DAGScheduler: Submitting 40000 missing tasks from ShuffleMapStage 11 (MapPartitionsRDD[39] at toJavaRDD at test_job.java:980) (first 15 tasks are for partitions Vector(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14))
19/07/03 04:04:53 INFO YarnClusterScheduler: Adding task set 11.0 with 40000 tasks 
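
For context, the code around those two line numbers is roughly this shape (heavily simplified; the class, variable, and table names here are stand-ins, not our real code):

import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class TestJobSketch {
    static void run(SparkSession spark, Dataset<Row> input) {
        // Around test_job.java:980 -- drop to a JavaRDD for row-level work.
        // toJavaRDD keeps the parent's partitioning, so if `input` arrives
        // with 40000 partitions, ShuffleMapStage 11 gets 40000 tasks.
        JavaRDD<Row> rows = input.toJavaRDD();

        // ... row-level transformations elided ...

        // Around test_job.java:3000 -- write the result out. The "200 output
        // partitions" in the DAGScheduler line matches the default value of
        // spark.sql.shuffle.partitions.
        Dataset<Row> result = spark.createDataFrame(rows, input.schema());
        result.write().saveAsTable("output_table");   // table name is a stand-in
    }
}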


It looks like this is the problem: for the next 10 minutes or so I see 40000 lines that look like this:

Starting task 12345.0 in stage 11.0 (TID 2613, host1.abc.com, executor 7, partition 38570, NODE_LOCAL, 5254 bytes)

These tasks all run on nodes 1, 3 and 5 (so only three of the nine workers), and are all NODE_LOCAL or PROCESS_LOCAL (not sure if that matters?).
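
One direction I was considering, in case it's relevant: coalescing the RDD down before the write, so the shuffle map stage doesn't need 40000 tiny tasks. This is just a sketch (the method name is made up, and the target of 96 partitions is a guess at ~4x our 24 cores, not a tested value):

import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.sql.Row;

// Sketch: collapse the 40000 partitions to something nearer the cores we
// actually run. coalesce avoids a full shuffle (unlike repartition).
static JavaRDD<Row> shrinkPartitions(JavaRDD<Row> rows) {
    return rows.coalesce(96);
}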

Any ideas?