Created 05-11-2021 01:48 AM
Hi
We periodically get errors in our ResourceManager logs, after which no Spark jobs are accepted until we restart the YARN ResourceManager.
We are on HDP 3.1.4 and we have Kerberos enabled.
The stack trace is below. Does anyone have ideas, please?
2021-05-08 04:15:59,407 ERROR yarn.YarnUncaughtExceptionHandler (YarnUncaughtExceptionHandler.java:uncaughtException(68)) - Thread Thread[Thread-20,5,main] threw an Exception.
java.lang.NullPointerException
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.ResourceCommitRequest.<init>(ResourceCommitRequest.java:101)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.createResourceCommitRequest(CapacityScheduler.java:2848)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.submitResourceCommitRequest(CapacityScheduler.java:2685)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainerOnSingleNode(CapacityScheduler.java:1558)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1684)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1436)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.schedule(CapacityScheduler.java:548)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler$AsyncScheduleThread.run(CapacityScheduler.java:595)
Created 06-07-2021 11:12 PM
Hello @michael_boulter,
the issue you hit looks like YARN-10295. It occurs when asynchronous scheduling is enabled in the CapacityScheduler.
The workaround for this issue is to set yarn.scheduler.capacity.schedule-asynchronously.enable to false in YARN's advanced scheduler configurations.
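To confirm what a given capacity-scheduler.xml currently says before changing it, something like the following may help. It is only a minimal sketch: the helper names and the sample file content are made up for illustration, and it only reports what is explicitly written in the file, not the effective default your distribution applies when the property is absent.

```python
import xml.etree.ElementTree as ET

ASYNC_KEY = "yarn.scheduler.capacity.schedule-asynchronously.enable"


def read_hadoop_properties(xml_text):
    """Return {name: value} for every <property> in a Hadoop-style config."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def async_scheduling_enabled(xml_text):
    """True only if the flag is explicitly set to 'true' in this text."""
    value = read_hadoop_properties(xml_text).get(ASYNC_KEY, "")
    return value.lower() == "true"


# Hypothetical file content, for illustration only.
SAMPLE = """<configuration>
  <property>
    <name>yarn.scheduler.capacity.schedule-asynchronously.enable</name>
    <value>false</value>
  </property>
</configuration>"""

print(async_scheduling_enabled(SAMPLE))
```

In practice you would pass in the contents of the capacity-scheduler.xml that the ResourceManager actually loads; its location depends on your installation.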
Hope it helps!
Kind regards,
Ferenc
Ferenc Erdelyi, Technical Solutions Manager
Created 07-19-2021 09:58 AM
Thanks! Before you replied, I tried the following, which also helped:
yarn.scheduler.capacity.schedule-asynchronously.maximum-threads=2
yarn.scheduler.capacity.schedule-asynchronously.scheduling-interval-ms=100
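For anyone who edits capacity-scheduler.xml by hand instead of using a key=value editor, the two settings above would need to be written as Hadoop `<property>` elements. A rough sketch of that conversion follows; the function name is hypothetical, and the values are simply the ones that worked on this cluster, not a general recommendation.

```python
from xml.sax.saxutils import escape


def to_hadoop_properties(lines):
    """Render key=value lines as Hadoop <property> elements."""
    out = []
    for line in lines:
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        out.append(
            "<property>\n"
            f"  <name>{escape(key.strip())}</name>\n"
            f"  <value>{escape(value.strip())}</value>\n"
            "</property>"
        )
    return "\n".join(out)


# The two settings quoted in the post above.
SETTINGS = [
    "yarn.scheduler.capacity.schedule-asynchronously.maximum-threads=2",
    "yarn.scheduler.capacity.schedule-asynchronously.scheduling-interval-ms=100",
]

print(to_hadoop_properties(SETTINGS))
```

The printed elements belong inside the file's `<configuration>` root element.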