Explorer
Posts: 33
Registered: ‎07-27-2015

Re: Job stuck in Accepted state under a specific pool

Actually, no. I didn't make any config changes. The YARN pool allocation is the same as in the fair-scheduler.xml above.

"The spark job submitted to root.qatest is running actually (State is RUNNING according to your screenshot)." => It shows as RUNNING, but it waits forever for task containers. The AM container gets assigned to the job, but task containers never do. If I look at the pool usage, the containers show as pending.

"It may also be helpful to look at the spark job log to see if there is any useful information there." => The same job runs fine on the default queue.

I see no pattern after long monitoring, but it never happens with the "default" pool, and it happens more frequently when there are more pools (I tried with 3-4).

I can't see anything wrong in the logs either. I am kind of running out of ideas.
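
For anyone wanting to confirm this symptom (AM container allocated, task containers never assigned), one option is to poll the ResourceManager REST API and list RUNNING applications that hold only a single container. This is a minimal sketch, not a diagnosis: the function names and the RM address are hypothetical, and "one running container means only the AM" is an assumption that can also match a job that is simply finishing up.

```python
import json
from urllib.request import urlopen


def fetch_apps(rm_url, states="ACCEPTED,RUNNING"):
    """Fetch applications from the RM REST API (/ws/v1/cluster/apps)."""
    with urlopen(f"{rm_url}/ws/v1/cluster/apps?states={states}") as resp:
        return json.load(resp)


def apps_with_only_am(payload, queue=None):
    """Return (id, queue, runningContainers) for RUNNING apps holding <= 1 container.

    Assumption: a RUNNING app with a single container has only its AM.
    """
    # The RM returns "apps": null when no application matches the filter.
    apps = (payload.get("apps") or {}).get("app", [])
    starved = []
    for app in apps:
        if queue is not None and app.get("queue") != queue:
            continue
        if app.get("state") == "RUNNING" and app.get("runningContainers", 0) <= 1:
            starved.append((app["id"], app["queue"], app.get("runningContainers")))
    return starved


# Example usage (hypothetical RM host):
#   payload = fetch_apps("http://rm-host:8088")
#   print(apps_with_only_am(payload, queue="root.qatest"))
```

Running this periodically while a job is stuck in the affected pool, and again for the same job in the default pool, would show whether the application ever gets past its AM container.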
Cloudera Employee
Posts: 55
Registered: ‎03-07-2016

Re: Job stuck in Accepted state under a specific pool

Do you still have the RM log from when the job was stuck in qatest? There may be some indication there.

Explorer
Posts: 33
Registered: ‎07-27-2015

Re: Job stuck in Accepted state under a specific pool

Sorry for the late reply.
Unfortunately, I don't have the logs now. I will share them if I see the same situation again.
Explorer
Posts: 13
Registered: ‎11-21-2013

Re: Job stuck in Accepted state under a specific pool

I am experiencing exactly the same issue. I'm attaching 3 images to show what's happening. The RM logs show nothing unusual...

Using CDH 5.4.5

Attachments: cm-resource-pools.png, rm-gui-scheduler.png, accepted resources.png

New Contributor
Posts: 1
Registered: ‎04-20-2016

Re: Job stuck in Accepted state under a specific pool

Increasing the ZooKeeper jute.maxbuffer fixed the problem for me. I increased it from 12 MB to 48 MB and did a rolling restart to fix the issue.
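
jute.maxbuffer is a Java system property whose value is given in bytes, and the ZooKeeper documentation says it has to be set consistently on servers and clients. As a small sketch of the arithmetic, the helper below (a hypothetical name, assuming 1 MB = 1024 * 1024 bytes) turns a size in MB into the corresponding JVM flag. Where the flag goes (for example, the Java options of the ZooKeeper and ResourceManager roles) depends on the deployment.

```python
def jute_maxbuffer_flag(megabytes):
    """Return the JVM flag that sets jute.maxbuffer to `megabytes` MB (in bytes)."""
    return f"-Djute.maxbuffer={megabytes * 1024 * 1024}"


print(jute_maxbuffer_flag(12))  # -Djute.maxbuffer=12582912
print(jute_maxbuffer_flag(48))  # -Djute.maxbuffer=50331648
```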

Thanks

Surya