
Spark job always stuck at state : ACCEPTED



Hi all, 

I am the Hadoop admin for my cluster. One application intermittently gets stuck in state ACCEPTED and then moves to state FAILED. This happens about once or twice in every 300-400 attempts: out of 300 attempts, roughly 297 run normally and 3 fail.

This is what the log normally looks like when the error occurs:

20/09/19 21:13:45 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:13:46 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:13:47 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:13:48 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:13:49 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:13:50 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:13:51 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:13:52 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:13:53 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:13:54 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:13:55 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:13:56 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:13:57 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:13:58 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:13:59 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:14:00 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:14:01 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:14:02 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:14:03 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:14:04 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:14:05 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:14:06 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:14:07 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:14:08 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:14:09 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:14:10 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:14:11 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:14:12 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:14:13 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:14:14 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:14:15 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:14:16 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:14:17 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:14:18 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:14:19 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:14:20 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:14:21 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:14:22 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:14:23 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:14:24 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:14:25 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:14:26 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:14:27 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:14:28 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:14:29 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:14:30 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:14:31 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:14:32 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:14:33 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:14:34 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:14:35 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:14:36 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:14:37 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:14:38 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:14:39 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:14:40 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:14:41 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:14:42 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:14:43 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:14:44 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:14:45 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:14:46 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:14:47 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:14:48 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:14:49 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:14:50 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:14:51 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:14:52 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:14:53 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:14:54 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:14:55 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:14:56 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:14:57 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:14:58 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:14:59 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:15:00 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:15:01 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:15:02 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:15:03 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:15:04 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:15:05 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:15:06 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:15:07 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:15:08 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:15:09 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:15:10 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:15:11 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:15:12 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:15:13 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:15:14 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:15:15 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:15:16 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:15:17 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:15:18 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:15:19 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:15:20 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:15:21 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:15:22 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:15:23 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:15:24 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:15:25 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:15:26 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:15:27 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:15:28 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:15:29 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:15:30 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:15:31 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:15:32 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:15:33 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:15:34 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:15:35 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:15:36 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:15:37 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:15:38 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:15:39 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:15:40 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:15:41 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:15:42 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:15:43 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:15:44 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:15:45 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:15:46 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:15:47 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:15:48 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:15:49 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:15:50 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:15:51 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:15:52 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:15:53 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:15:54 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:15:55 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:15:56 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:15:57 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:15:58 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:15:59 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:16:00 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:16:01 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:16:02 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:16:03 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:16:04 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:16:05 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:16:06 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:16:07 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:16:08 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:16:09 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:16:10 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:16:11 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:16:12 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:16:13 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:16:14 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:16:15 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:16:16 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:16:17 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:16:18 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:16:19 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:16:20 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:16:21 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:16:22 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:16:23 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:16:24 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:16:25 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:16:26 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:16:27 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:16:28 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:16:29 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:16:30 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:16:31 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:16:32 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:16:33 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:16:34 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:16:35 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:16:36 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:16:37 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:16:38 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:16:39 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:16:40 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:16:41 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:16:42 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:16:43 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:16:44 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:16:45 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:16:46 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:16:47 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:16:48 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:16:49 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:16:50 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:16:51 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:16:52 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:16:53 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:16:54 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:16:55 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:16:56 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:16:57 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:16:58 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:16:59 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:17:00 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:17:01 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:17:02 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:17:03 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:17:04 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:17:05 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:17:06 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:17:07 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:17:08 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:17:09 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:17:10 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)
20/09/19 21:17:11 INFO Client: Application report for application_1599812013064_109877 (state: FAILED)


Whereas the normal behavior looks like this:

20/09/20 03:47:14 INFO Client: Application report for application_1599812013064_113242 (state: ACCEPTED)
20/09/20 03:47:15 INFO Client: Application report for application_1599812013064_113242 (state: ACCEPTED)
20/09/20 03:47:16 INFO Client: Application report for application_1599812013064_113242 (state: ACCEPTED)
20/09/20 03:47:17 INFO Client: Application report for application_1599812013064_113242 (state: ACCEPTED)
20/09/20 03:47:18 INFO Client: Application report for application_1599812013064_113242 (state: ACCEPTED)
20/09/20 03:47:19 INFO Client: Application report for application_1599812013064_113242 (state: ACCEPTED)
20/09/20 03:47:20 INFO Client: Application report for application_1599812013064_113242 (state: ACCEPTED)
20/09/20 03:47:21 INFO Client: Application report for application_1599812013064_113242 (state: ACCEPTED)
20/09/20 03:47:22 INFO Client: Application report for application_1599812013064_113242 (state: ACCEPTED)
20/09/20 03:47:23 INFO Client: Application report for application_1599812013064_113242 (state: ACCEPTED)
20/09/20 03:47:24 INFO Client: Application report for application_1599812013064_113242 (state: ACCEPTED)
20/09/20 03:47:25 INFO Client: Application report for application_1599812013064_113242 (state: ACCEPTED)
20/09/20 03:47:26 INFO Client: Application report for application_1599812013064_113242 (state: ACCEPTED)
20/09/20 03:47:27 INFO Client: Application report for application_1599812013064_113242 (state: ACCEPTED)
20/09/20 03:47:28 INFO Client: Application report for application_1599812013064_113242 (state: ACCEPTED)
20/09/20 03:47:29 INFO Client: Application report for application_1599812013064_113242 (state: ACCEPTED)
20/09/20 03:47:30 INFO Client: Application report for application_1599812013064_113242 (state: ACCEPTED)
20/09/20 03:47:31 INFO Client: Application report for application_1599812013064_113242 (state: ACCEPTED)
20/09/20 03:47:32 INFO Client: Application report for application_1599812013064_113242 (state: ACCEPTED)
20/09/20 03:47:33 INFO Client: Application report for application_1599812013064_113242 (state: ACCEPTED)
20/09/20 03:47:34 INFO Client: Application report for application_1599812013064_113242 (state: ACCEPTED)
20/09/20 03:47:35 INFO Client: Application report for application_1599812013064_113242 (state: ACCEPTED)
20/09/20 03:47:36 INFO Client: Application report for application_1599812013064_113242 (state: ACCEPTED)
20/09/20 03:47:37 INFO Client: Application report for application_1599812013064_113242 (state: ACCEPTED)
20/09/20 03:47:38 INFO Client: Application report for application_1599812013064_113242 (state: ACCEPTED)
20/09/20 03:47:39 INFO Client: Application report for application_1599812013064_113242 (state: ACCEPTED)
20/09/20 03:47:40 INFO Client: Application report for application_1599812013064_113242 (state: ACCEPTED)
20/09/20 03:47:41 INFO Client: Application report for application_1599812013064_113242 (state: ACCEPTED)
20/09/20 03:47:42 INFO Client: Application report for application_1599812013064_113242 (state: RUNNING)
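
To put a number on the difference between the two runs, here is a minimal sketch that measures how long each application stays in ACCEPTED according to the client log. It is not part of the job itself; the function name `accepted_durations` is made up for illustration, and it assumes the timestamp format shown above (yy/MM/dd HH:mm:ss). The sample lines are the first and last lines of each excerpt, so the result only covers the part of the log pasted here.

```python
import re
from datetime import datetime

# Matches client log lines such as:
# 20/09/19 21:13:45 INFO Client: Application report for application_..._109877 (state: ACCEPTED)
REPORT_RE = re.compile(
    r"^(\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) INFO Client: "
    r"Application report for (application_\d+_\d+) \(state: (\w+)\)"
)

def accepted_durations(lines):
    """Return {app_id: (seconds_in_ACCEPTED, next_state)} for each application."""
    first_seen = {}   # app_id -> timestamp of first ACCEPTED report
    result = {}
    for line in lines:
        m = REPORT_RE.match(line.strip())
        if not m:
            continue
        ts = datetime.strptime(m.group(1), "%y/%m/%d %H:%M:%S")
        app_id, state = m.group(2), m.group(3)
        if state == "ACCEPTED":
            first_seen.setdefault(app_id, ts)
        elif app_id in first_seen and app_id not in result:
            # First report that is no longer ACCEPTED ends the wait.
            elapsed = int((ts - first_seen[app_id]).total_seconds())
            result[app_id] = (elapsed, state)
    return result

sample = [
    "20/09/19 21:13:45 INFO Client: Application report for application_1599812013064_109877 (state: ACCEPTED)",
    "20/09/19 21:17:11 INFO Client: Application report for application_1599812013064_109877 (state: FAILED)",
    "20/09/20 03:47:14 INFO Client: Application report for application_1599812013064_113242 (state: ACCEPTED)",
    "20/09/20 03:47:42 INFO Client: Application report for application_1599812013064_113242 (state: RUNNING)",
]

for app_id, (seconds, state) in accepted_durations(sample).items():
    print(app_id, seconds, state)
```

On these excerpts the failing run spends 206 seconds in ACCEPTED before FAILED, while the normal run spends 28 seconds before RUNNING.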

 

This is the error I get after the application has been stuck in the ACCEPTED state for a long time:

20/09/19 21:17:10 ERROR ApplicationMaster: Uncaught exception: 
java.util.concurrent.TimeoutException: Futures timed out after [100000 milliseconds]
	at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:223)
	at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:227)
	at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:201)
	at org.apache.spark.deploy.yarn.ApplicationMaster.runDriver(ApplicationMaster.scala:498)
	at org.apache.spark.deploy.yarn.ApplicationMaster.org$apache$spark$deploy$yarn$ApplicationMaster$$runImpl(ApplicationMaster.scala:345)
	at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$2.apply$mcV$sp(ApplicationMaster.scala:260)
	at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$2.apply(ApplicationMaster.scala:260)
	at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$2.apply(ApplicationMaster.scala:260)
	at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$5.run(ApplicationMaster.scala:815)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
	at org.apache.spark.deploy.yarn.ApplicationMaster.doAsUser(ApplicationMaster.scala:814)
	at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:259)
	at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:839)
	at org.apache.spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)
20/09/19 21:17:10 INFO ApplicationMaster: Final app status: FAILED, exitCode: 13, (reason: Uncaught exception: java.util.concurrent.TimeoutException: Futures timed out after [100000 milliseconds])
20/09/19 21:17:10 INFO ApplicationMaster: Deleting staging directory hdfs://TSPSTREAM01/user/streamf10w/.sparkStaging/application_1599812013064_109876
20/09/19 21:17:10 INFO ShutdownHookManager: Shutdown hook called

 

NodeManager log on the data node side:

 

2020-09-19 21:15:27,144 WARN  nodemanager.LinuxContainerExecutor (LinuxContainerExecutor.java:handleExitCode(591)) - Exception from container-launch with container ID: container_e62_1599812013064_109877_01_000001 and exit code: 13
org.apache.hadoop.yarn.server.nodemanager.containermanager.runtime.ContainerExecutionException: Launch container failed
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DefaultLinuxContainerRuntime.launchContainer(DefaultLinuxContainerRuntime.java:125)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DelegatingLinuxContainerRuntime.launchContainer(DelegatingLinuxContainerRuntime.java:141)
        at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.handleLaunchForLaunchType(LinuxContainerExecutor.java:564)
        at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.launchContainer(LinuxContainerExecutor.java:479)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.launchContainer(ContainerLaunch.java:500)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:310)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:105)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

 




This is how the PySpark job is submitted:

 

spark-submit \
--master yarn \
--deploy-mode cluster \
--queue xxx \
--driver-cores 1 \
--driver-memory 1g \
--num-executors 1 \
--executor-memory 1g \
--executor-cores 1 \
--conf spark.pyspark.python=/anaconda_env/projects/xxx/bin/python \
--conf spark.pyspark.driver.python=/anaconda_env/projects/xxx/bin/python \
--conf spark.executor.memoryOverhead=5120 \
--py-files xxx.py

 


Does anyone know what the possible root cause might be? What I see is that, intermittently, a job submitted to the cluster gets stuck in the ACCEPTED state for a long time and then turns FAILED. My cluster has plenty of resources: 4800 GB of memory and 864 cores. This job normally takes about 9 GB and 2 cores on average and runs for about 30 s.
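
For reference, here is a rough sketch of how the memory request for this job adds up from the spark-submit options above. It rests on assumptions that are not confirmed for my cluster: that the driver overhead is left at Spark's default of max(384 MB, 10% of driver memory), and that YARN rounds each container up to a multiple of a 1024 MB minimum allocation. The helper `container_mb` is a made-up name for illustration.

```python
import math

def container_mb(heap_mb, overhead_mb=None, min_alloc_mb=1024):
    """Estimate the YARN container size for one Spark JVM, in MB."""
    if overhead_mb is None:
        # Assumed Spark default overhead: max(384 MB, 10% of the heap).
        overhead_mb = max(384, int(heap_mb * 0.10))
    requested = heap_mb + overhead_mb
    # Assumption: YARN rounds up to a multiple of the minimum allocation.
    return math.ceil(requested / min_alloc_mb) * min_alloc_mb

driver = container_mb(1024)                      # --driver-memory 1g
executor = container_mb(1024, overhead_mb=5120)  # --executor-memory 1g + memoryOverhead=5120

print("driver MB:", driver)
print("executor MB:", executor)
print("total MB:", driver + executor)
```

Under these assumptions the job asks for 2048 MB for the driver and 6144 MB for the single executor, 8192 MB in total. That is in the same range as the roughly 9 GB I observe; the exact figure depends on the cluster's allocation settings.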

Do I need to tune the parameters of the spark-submit job to prevent this from happening?

Thank you in advance!