New Contributor
Posts: 1
Registered: ‎07-30-2016

Parallel Spark jobs stuck at accepted state in 'yarn' mode through Oozie

Hello,

 

I am trying to run two parallel Spark jobs through Oozie. These jobs execute successfully when the Spark master is set to local[*]. However, when I run them with the Spark master set to yarn, the workflow execution gets stuck in the following state:

 

1. The two Oozie MR jobs get stuck at 95%.

2. The first Spark job reaches the RUNNING state and gets stuck at 10%.

3. The second Spark job stays in the ACCEPTED state.

 

I am using a CDH 5.8 cluster on Amazon EC2 with one master and three slaves. I have set the following resource-related settings through Cloudera Manager:

 

1. Dynamic Resource Pool Configuration -> User Limits -> Default 10

2. Dynamic Resource Pool Configuration -> Resource Pools -> Max Running Apps 10

 

What could be the issue here? Any help is highly appreciated. Thanks in advance.

 

 

jobs.png

Posts: 177
Topics: 8
Kudos: 28
Solutions: 19
Registered: ‎07-16-2015

Re: Parallel Spark jobs stuck at accepted state in 'yarn' mode through Oozie

Hi,

 

Are you using Spark on YARN?

If so, are there any containers available on YARN?

 

Could you tell us your YARN configuration:

- vcores and memory per node?

- minimum memory and vcores per container?

 

Also, how many vcores and how much memory does your Spark application request?

 

This information will let you determine whether your "accepted" job is waiting for an available container.
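
To make that check concrete, here is a minimal sketch in Python of the arithmetic involved. Every number in it is hypothetical (node sizes, container sizes, the 512 MB rounding step), and the helper names `Request`, `round_up` and `unplaced_requests` are made up for illustration. It also assumes that each Oozie launcher is a MapReduce job holding one application master container plus one map container, and that the scheduler accounts for both memory and vcores. Replace the values with the ones from your own cluster.

```python
import math
from dataclasses import dataclass


@dataclass
class Request:
    """One YARN container request (all values are hypothetical)."""
    name: str
    memory_mb: int
    vcores: int


def round_up(value: int, step: int) -> int:
    """Round a memory request up to a multiple of the allocation step."""
    return int(math.ceil(value / step) * step)


def unplaced_requests(requests, node_count, node_mem_mb, node_vcores,
                      mem_step_mb=512):
    """Place requests on nodes first-fit and return the names that do not fit."""
    # Each node starts with its full [memory, vcores] budget.
    free = [[node_mem_mb, node_vcores] for _ in range(node_count)]
    unplaced = []
    for req in requests:
        mem = round_up(req.memory_mb, mem_step_mb)
        for node in free:
            if node[0] >= mem and node[1] >= req.vcores:
                node[0] -= mem
                node[1] -= req.vcores
                break
        else:
            # No node had enough memory and vcores left for this container.
            unplaced.append(req.name)
    return unplaced


# Hypothetical workload: two Oozie launchers, then two Spark applications.
requests = [
    Request("oozie-launcher-1 AM", 1024, 1),
    Request("oozie-launcher-1 map", 1024, 1),
    Request("oozie-launcher-2 AM", 1024, 1),
    Request("oozie-launcher-2 map", 1024, 1),
    Request("spark-job-1 AM", 1024, 1),
    Request("spark-job-1 executor", 1408, 1),
    Request("spark-job-2 AM", 1024, 1),
]

# Hypothetical cluster: 3 NodeManagers offering 4096 MB and 2 vcores each.
print(unplaced_requests(requests, node_count=3, node_mem_mb=4096, node_vcores=2))
```

With these made-up numbers the launchers and the first Spark job use up all six vcores, so the application master of the second Spark job cannot be placed. An application whose application master has no container yet stays in the ACCEPTED state, which would match the symptom described above.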