I am struggling to find complete guidance on how to isolate GPUs and schedule application jobs on them in HDP 3.0. In HDP 3.0 the GPU is a native resource, so are GPU isolation and scheduling the only improvements?
For now (3.1.1/3.2.0), capacity.CapacityScheduler is broken for GPUs by a hardcoded enum that contains only the vCores and RAM parameters.
The workaround is to switch your scheduler class to org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.
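As a minimal sketch of that switch, assuming the scheduler is configured through the standard yarn.resourcemanager.scheduler.class property in yarn-site.xml on the ResourceManager (where exactly you edit this depends on how your cluster's configuration is managed, which is not covered here):

```xml
<!-- yarn-site.xml on the ResourceManager (assumed location) -->
<!-- Replaces the default CapacityScheduler with the FairScheduler -->
<property>
  <name>yarn.resourcemanager.scheduler.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
</property>
```

The ResourceManager most likely needs a restart for the new scheduler class to take effect.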
You also want to replace "capacity" with "Fair" in the line
Your GPUs will not be visible in YARN UI2, but they will still show up on the NodeManagers and, most importantly, will be allocated properly.
It was indeed a mess to figure out.