Hello everyone, I have a problem: a small job that normally runs for about two minutes hung for 40 minutes after being submitted to YARN, even though there was no load on the cluster. When it was restarted, it ran without problems and completed in 2 minutes. Has anyone faced this problem, and how can I fix it? I can provide logs if necessary.
Originally posted by @Infectus as a community article. Re-posting it on the Support Questions board for Community members to answer.
The issue is very generic, and there are several possible reasons for a job to stay in a pending state in YARN.
You can follow the steps below to narrow down the problem.
1. Check the application log of the pending job:
yarn logs -applicationId <application id of job>
2. Identify which application submitted the job (Hive, Hue, Oozie, etc.). For example, with a Hive job, the job may be submitted to YARN and then take a long time to write metadata to the Hive metastore. In my case, the job was stuck for more than an hour at that point because a metastore database backup was running at the same time and the database was in read-only mode.
3. Enable DEBUG logging for the ResourceManager and collect the log entries related to your application from the ResourceManager log.
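The steps above could be scripted roughly as follows. This is only a sketch under some assumptions: the function names are made up for illustration, the application ID, host:port, logger name and file paths are placeholders you need to supply, and `yarn logs` only returns something if the logs are available for that application on your cluster. The `yarn application -status` and `yarn daemonlog -setlevel` calls are not mentioned in the steps above; they are added here as one possible way to see the reported state and to raise the log level without a restart.

```shell
#!/usr/bin/env bash
# Hypothetical helper functions; nothing here runs until you call them.

# Step 1: save the application log of the job to a file for inspection.
save_app_log() {
  local app_id="$1" out_file="$2"
  yarn logs -applicationId "$app_id" > "$out_file"
}

# Extra check (assumption, not from the steps above): print the state
# that YARN currently reports for the application.
show_app_state() {
  local app_id="$1"
  yarn application -status "$app_id"
}

# Step 3a: change the level of a logger on a running daemon.
# rm_http_addr is the ResourceManager web address as host:port,
# logger is the logger/class name you want to debug (you choose it),
# level is e.g. DEBUG or INFO.
set_rm_log_level() {
  local rm_http_addr="$1" logger="$2" level="$3"
  yarn daemonlog -setlevel "$rm_http_addr" "$logger" "$level"
}

# Step 3b: print only the lines of a ResourceManager log file that
# mention the given application ID (fixed-string match, not a regex).
rm_log_for_app() {
  local app_id="$1" rm_log="$2"
  grep -F -- "$app_id" "$rm_log"
}
```

For example, after sourcing the file you could run `save_app_log <application id of job> app.log`, then `rm_log_for_app <application id of job> <path to ResourceManager log>` and compare the timestamps in both outputs to see where the 40 minutes were spent. Remember to set the log level back to INFO afterwards, since DEBUG output on the ResourceManager can grow quickly.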