Yarn jobs are getting stuck in ACCEPTED state

New Contributor

I've set up a new Ambari cluster, but I can't run any jobs on it using YARN because they all get stuck in the ACCEPTED state (the AM container waits for the RM). If I go to the allocated container, its state is RUNNING but there are no logs, only a message that it is currently in the LOCALIZING state.

The job eventually fails with a timeout.

The Ambari version is 2.6.2.0.

Thanks

3 REPLIES

Master Collaborator

@Artyom Timofeev

As for the containers being stuck in the localizing phase, it seems you are hitting this reported YARN bug, which is resolved in version 3.0.0.

https://issues.apache.org/jira/browse/YARN-6078

Explorer

@Jagadeesan A S I am using 3.0.0.0-1634, but I am still facing this issue. Is it resolved in 3.0.0 or 3.1.0?

The annoying part is that the issue is random. It can hit any job at any time. The same job runs fine and then suddenly fails with this issue, and a couple of days later some other job might fail even though it ran successfully before and will run properly again afterwards. No other jobs are running at that time.

Master Collaborator

@Suraj Singh

Actually, this particular fix was released in the following versions: 3.1.0, 2.10.0, 2.9.1, and 3.0.1. Related JIRA: https://issues.apache.org/jira/browse/YARN-7873. If you read through the complete comments, you will get an idea of why YARN-6078 was reverted and not released in 3.0.0.
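
If it helps, here is a rough Python sketch for checking a version string against the fix versions listed above. It is only an illustration: the helper names are made up, and it assumes that a release contains the fix when it is at or past the fix version of its own major.minor line, or newer than every listed line. It expects the Apache Hadoop version (for example as printed by `hadoop version`), not a vendor stack version, and it only looks at the first three numeric fields.

```python
# Fix versions quoted in this thread (see YARN-7873).
FIX_VERSIONS = [(2, 9, 1), (2, 10, 0), (3, 0, 1), (3, 1, 0)]


def parse_version(text):
    """Return the first three numeric fields of a version string as a tuple.

    Hypothetical helper: anything after a '-' is ignored, and missing
    fields are padded with zeros, so '3.1' becomes (3, 1, 0).
    """
    fields = text.strip().split("-")[0].split(".")[:3]
    numbers = [int(field) for field in fields]
    while len(numbers) < 3:
        numbers.append(0)
    return tuple(numbers)


def has_localizer_fix(text):
    """Guess whether an Apache Hadoop version includes the fix.

    Assumption: within a major.minor line the fix is present from the
    listed fix version onward; lines not listed have it only if they are
    newer than every listed fix version.
    """
    version = parse_version(text)
    for fix in FIX_VERSIONS:
        if version[:2] == fix[:2]:
            return version >= fix
    return version >= max(FIX_VERSIONS)


if __name__ == "__main__":
    for candidate in ["2.9.0", "2.9.1", "3.0.0", "3.0.1", "3.1.0"]:
        print(candidate, has_localizer_fix(candidate))
```

Treat the result as a hint only. Vendor builds may backport or omit patches, so the release notes for your distribution are the real source of truth.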