YARN ResourceManager CapacityScheduler ERROR
Created ‎05-11-2021 01:48 AM
Hi,
We periodically see errors in our ResourceManager logs, after which no Spark jobs are accepted until we restart the YARN ResourceManager.
We are on HDP 3.1.4 with Kerberos enabled.
The stack trace is below. Does anyone have any ideas?
2021-05-08 04:15:59,407 ERROR yarn.YarnUncaughtExceptionHandler (YarnUncaughtExceptionHandler.java:uncaughtException(68)) - Thread Thread[Thread-20,5,main] threw an Exception.
java.lang.NullPointerException
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.ResourceCommitRequest.<init>(ResourceCommitRequest.java:101)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.createResourceCommitRequest(CapacityScheduler.java:2848)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.submitResourceCommitRequest(CapacityScheduler.java:2685)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainerOnSingleNode(CapacityScheduler.java:1558)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1684)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1436)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.schedule(CapacityScheduler.java:548)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler$AsyncScheduleThread.run(CapacityScheduler.java:595)
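To check whether earlier outages line up with the same failure, the ResourceManager log can be scanned for this signature. The following is a minimal Python sketch, not part of the original post: the function name `find_commit_request_npes` and the regular expression are illustrative assumptions derived only from the log lines quoted above, so they may need adjusting for a different log layout.

```python
import re

# Header written by YarnUncaughtExceptionHandler when a thread dies.
# The pattern is an assumption based on the single log line quoted in this thread.
HEADER = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) ERROR "
    r"yarn\.YarnUncaughtExceptionHandler .*Thread\[(?P<thread>[^\]]+)\]"
)
NPE = "java.lang.NullPointerException"
TOP_FRAME = "ResourceCommitRequest.<init>"


def find_commit_request_npes(log_lines):
    """Return (timestamp, thread) pairs for uncaught NPEs whose top frame
    is ResourceCommitRequest.<init>."""
    lines = [line.rstrip("\n") for line in log_lines]
    hits = []
    for i, line in enumerate(lines):
        match = HEADER.match(line)
        if not match:
            continue
        # Inspect only the two lines after the header:
        # the exception type, then the top stack frame.
        following = lines[i + 1:i + 3]
        if len(following) == 2 and NPE in following[0] and TOP_FRAME in following[1]:
            hits.append((match.group("ts"), match.group("thread")))
    return hits


SAMPLE = [
    "2021-05-08 04:15:59,407 ERROR yarn.YarnUncaughtExceptionHandler "
    "(YarnUncaughtExceptionHandler.java:uncaughtException(68)) - "
    "Thread Thread[Thread-20,5,main] threw an Exception.",
    "java.lang.NullPointerException",
    "at org.apache.hadoop.yarn.server.resourcemanager.scheduler.common."
    "ResourceCommitRequest.<init>(ResourceCommitRequest.java:101)",
]

if __name__ == "__main__":
    for timestamp, thread in find_commit_request_npes(SAMPLE):
        print(timestamp, thread)
```

For a real log, pass an open file object (or its lines) instead of `SAMPLE`; the log file location depends on the cluster and is not given in this thread.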
Created ‎06-07-2021 11:12 PM
Hello @michael_boulter,
The issue you hit looks like YARN-10295, which occurs when CapacityScheduler asynchronous scheduling is enabled.
The workaround is to set yarn.scheduler.capacity.schedule-asynchronously.enable to false in YARN's advanced scheduler configurations.
Hope it helps!
Kind regards,
Ferenc
Ferenc Erdelyi, Technical Solutions Manager
Created ‎07-19-2021 09:58 AM
Thanks! Before you replied, I tried the following, which also helped:
yarn.scheduler.capacity.schedule-asynchronously.maximum-threads=2
yarn.scheduler.capacity.schedule-asynchronously.scheduling-interval-ms=100
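To confirm which asynchronous scheduling settings a cluster actually has, the properties discussed in this thread can be read back from the scheduler configuration. The following is a minimal Python sketch under stated assumptions: it expects the standard Hadoop `<configuration>/<property>` XML layout, the function name `async_scheduling_settings` is illustrative, and the sample values simply mirror the ones mentioned above.

```python
import xml.etree.ElementTree as ET

# Common prefix of the asynchronous scheduling properties named in this thread.
PREFIX = "yarn.scheduler.capacity.schedule-asynchronously."


def async_scheduling_settings(xml_text):
    """Return {property name: value} for every property in a Hadoop-style
    configuration XML whose name starts with PREFIX."""
    root = ET.fromstring(xml_text)
    settings = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        if name.startswith(PREFIX):
            settings[name] = (prop.findtext("value") or "").strip()
    return settings


# Hypothetical sample; the values mirror the settings quoted in this thread.
SAMPLE = """<configuration>
  <property>
    <name>yarn.scheduler.capacity.schedule-asynchronously.enable</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.schedule-asynchronously.maximum-threads</name>
    <value>2</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.schedule-asynchronously.scheduling-interval-ms</name>
    <value>100</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.maximum-applications</name>
    <value>10000</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    for name, value in sorted(async_scheduling_settings(SAMPLE).items()):
        print(f"{name}={value}")
```

On a real cluster, pass the contents of the deployed capacity-scheduler.xml instead of `SAMPLE`. Its location varies by installation, and on a managed cluster the values should be changed through the management UI rather than by editing the file directly.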
