Invalid resource request, requested resource type=[yarn.io/gpu]

New Contributor

I'm facing this issue when trying to use GPUs on YARN:

Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException): Invalid resource request, requested resource type=[yarn.io/gpu] < 0 or greater than maximum allowed allocation. Requested resource=<memory:3072, vCores:1, yarn.io/gpu: 1>, maximum allowed allocation=<memory:9216, vCores:9>, please note that maximum allowed allocation is calculated by scheduler based on maximum resource of registered NodeManagers, which might be less than configured maximum allocation=<memory:9216, vCores:9, yarn.io/gpu: 9223372036854775807>

I have already enabled GPU support on my cluster, but somehow the maximum allowed allocation is still reported without yarn.io/gpu: maximum allowed allocation=<memory:9216, vCores:9>
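
One way to read the exception above is to compare the resource types in the request with the ones in the maximum allowed allocation. The Python sketch below does that for the message as posted. The function names are made up for illustration, and the parsing assumes the `<name:value, ...>` layout seen in this thread; it is not part of any Hadoop tooling.

```python
import re

# Error text copied from the post above.
MESSAGE = (
    "Invalid resource request, requested resource type=[yarn.io/gpu] < 0 or "
    "greater than maximum allowed allocation. Requested resource=<memory:3072, "
    "vCores:1, yarn.io/gpu: 1>, maximum allowed allocation=<memory:9216, "
    "vCores:9>, please note that maximum allowed allocation is calculated by "
    "scheduler based on maximum resource of registered NodeManagers, which "
    "might be less than configured maximum allocation=<memory:9216, vCores:9, "
    "yarn.io/gpu: 9223372036854775807>"
)


def parse_resources(text):
    """Turn 'memory:3072, vCores:1, yarn.io/gpu: 1' into a dict of ints."""
    resources = {}
    for item in text.split(","):
        name, value = item.rsplit(":", 1)
        resources[name.strip()] = int(value)
    return resources


def missing_resource_types(message):
    """Return requested resource types absent from the allowed allocation."""
    requested = re.search(r"Requested resource=<([^>]*)>", message)
    allowed = re.search(r"maximum allowed allocation=<([^>]*)>", message)
    if requested is None or allowed is None:
        raise ValueError("message does not look like the error in this thread")
    requested_types = parse_resources(requested.group(1))
    allowed_types = parse_resources(allowed.group(1))
    return sorted(name for name in requested_types if name not in allowed_types)


if __name__ == "__main__":
    print(missing_resource_types(MESSAGE))
```

For the message in this post, the only type reported as missing is yarn.io/gpu, which matches the symptom described: the scheduler's computed maximum knows about memory and vCores only.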

Accepted Solutions

Re: Invalid resource request, requested resource type=[yarn.io/gpu]

New Contributor

After about two weeks of trying various things, we finally settled on a full wipe of every host and a clean install from scratch.
Still nothing worked.

Then we tried a "one worker" setup and manually defined a countable resource to test the allocation mechanism, and then...
NOTHING hortonWORKS!

But my Googling went better after that.
It seems to be a Hadoop issue with custom resources and the CapacityScheduler, enjoy:

https://issues.apache.org/jira/browse/YARN-9161
https://issues.apache.org/jira/browse/YARN-9205


Temporary solution to benefit from isolation:

For now (3.1.1/3.2.0), capacity.CapacityScheduler is broken by a hardcoded enum that contains only the vCores and RAM parameters. You just have to switch your scheduler class to org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler. You also want to replace "capacity" with "fair" in the resource calculator line, so that it reads yarn.scheduler.fair.resource-calculator=org.apache.hadoop.yarn.util.resource.DominantResourceCalculator

Your GPUs will not be visible in YARN UI2, but they will still show up on the NodeManagers and, most importantly, will be allocated properly. It was a mess to figure out indeed.
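
For anyone who edits yarn-site.xml by hand rather than through a management UI, here is a small Python sketch that renders the two settings from this workaround as property elements. It assumes the scheduler class is set through yarn.resourcemanager.scheduler.class; the resource-calculator key is taken as written in this post and has not been checked against the FairScheduler documentation. The helper name `render_properties` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Settings described in the workaround above. The second key is quoted from
# the post as-is (an assumption, not a verified FairScheduler property).
OVERRIDES = {
    "yarn.resourcemanager.scheduler.class":
        "org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler",
    "yarn.scheduler.fair.resource-calculator":
        "org.apache.hadoop.yarn.util.resource.DominantResourceCalculator",
}


def render_properties(overrides):
    """Build a <configuration> fragment with one <property> per override."""
    root = ET.Element("configuration")
    for name, value in overrides.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    if hasattr(ET, "indent"):  # pretty-print where available (Python 3.9+)
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(render_properties(OVERRIDES))
```

The output is meant to be merged into an existing yarn-site.xml, not to replace it, and the ResourceManager needs a restart for a scheduler change to take effect.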

Replies

Re: Invalid resource request, requested resource type=[yarn.io/gpu]

New Contributor

Exact same problem. I have 2 GPUs in my test cluster; both show up (load included) in the RM / Nodes UI, but neither of them can be allocated... same "maximum allocation" referring only to CPUs and RAM.

Re: Invalid resource request, requested resource type=[yarn.io/gpu]

New Contributor

It seems to be about the ResourceCalculator used when requesting containers, since it shows only CPU/memory, as the DefaultResourceCalculator would. But everywhere I check, my node registers its GPU properly and DominantResourceCalculator is set...

Re: Invalid resource request, requested resource type=[yarn.io/gpu]

New Contributor

I have run into the same issue. It works with FairScheduler but not with CapacityScheduler. To add to the instructions above for those who normally use CapacityScheduler (99.99% of the Hadoop population :-)) but want to try FairScheduler: remember to also disable other CS-specific features, such as preemption, as the Resource Manager won't start otherwise.
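
A rough pre-flight check for this can be sketched in Python. It assumes that "preemption" here maps to the scheduling monitor switch yarn.resourcemanager.scheduler.monitor.enable, which is my reading and not something stated in this thread; the function names are hypothetical, and other CS-specific settings are not covered.

```python
import xml.etree.ElementTree as ET

FAIR_SCHEDULER = (
    "org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler"
)
SCHEDULER_CLASS_KEY = "yarn.resourcemanager.scheduler.class"
# Assumption: this is the switch behind the "Preemption" feature mentioned above.
MONITOR_KEY = "yarn.resourcemanager.scheduler.monitor.enable"


def load_properties(xml_text):
    """Read <property><name>/<value> pairs from yarn-site.xml style text."""
    root = ET.fromstring(xml_text)
    return {
        prop.findtext("name", "").strip(): prop.findtext("value", "").strip()
        for prop in root.findall("property")
    }


def preemption_conflicts(props):
    """List settings that look left over from a CapacityScheduler setup."""
    issues = []
    uses_fair = props.get(SCHEDULER_CLASS_KEY) == FAIR_SCHEDULER
    monitor_on = props.get(MONITOR_KEY, "false").lower() == "true"
    if uses_fair and monitor_on:
        issues.append(MONITOR_KEY + " is true while FairScheduler is selected")
    return issues


if __name__ == "__main__":
    sample = """
    <configuration>
      <property>
        <name>yarn.resourcemanager.scheduler.class</name>
        <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
      </property>
      <property>
        <name>yarn.resourcemanager.scheduler.monitor.enable</name>
        <value>true</value>
      </property>
    </configuration>
    """
    for issue in preemption_conflicts(load_properties(sample)):
        print(issue)
```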