Created 04-04-2016 09:03 AM
I created two queues (it, price). I expect that when a user runs a job on the cluster, the job gets all of the cluster's free resources (77 containers in our case). However, ituser1 uses only the resources available to its own queue (32 containers). Is it possible to allow ituser1 to use all available resources of the cluster?
Total number of containers in the cluster: 77
yarn.resourcemanager.scheduler.class=org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
yarn.scheduler.capacity.maximum-am-resource-percent=0.2
yarn.scheduler.capacity.maximum-applications=10000
yarn.scheduler.capacity.node-locality-delay=40
yarn.scheduler.capacity.resource-calculator=org.apache.hadoop.yarn.util.resource.DefaultResourceCalculator
yarn.scheduler.capacity.root.accessible-node-labels=*
yarn.scheduler.capacity.root.accessible-node-labels.default.capacity=-1
yarn.scheduler.capacity.root.accessible-node-labels.default.maximum-capacity=-1
yarn.scheduler.capacity.root.acl_administer_queue=*
yarn.scheduler.capacity.root.capacity=100
yarn.scheduler.capacity.root.default-node-label-expression=
yarn.scheduler.capacity.root.it.user-limit-factor=1
yarn.scheduler.capacity.root.price.user-limit-factor=1
yarn.scheduler.capacity.root.it.state=RUNNING
yarn.scheduler.capacity.root.price.state=RUNNING
yarn.scheduler.capacity.root.it.capacity=40
yarn.scheduler.capacity.root.price.capacity=60
yarn.scheduler.capacity.root.it.maximum-capacity=100
yarn.scheduler.capacity.root.price.maximum-capacity=100
yarn.scheduler.capacity.queue-mappings=u:ituser1:it,u:ituser2:it,u:ituser3:it,u:priceuser1:price,u:priceuser2:price,u:priceuser3:price
yarn.scheduler.capacity.root.it.minimum-userlimit-Percent=50
yarn.scheduler.capacity.root.price.minimum-userlimit-Percent=30
yarn.scheduler.capacity.root.price.default.ordering-policy=fair
yarn.scheduler.capacity.root.it.default.ordering-policy=fair
yarn.scheduler.capacity.root.it.acl_administer_jobs=*
yarn.scheduler.capacity.root.it.acl_submit_applications=*
yarn.scheduler.capacity.root.price.acl_administer_jobs=*
yarn.scheduler.capacity.root.price.acl_submit_applications=*
yarn.scheduler.capacity.root.queues=it,price
Created 04-04-2016 09:10 AM
Hello Alena
On top of the queue distribution and elasticity, there are other elements that can be configured to help share the resources. For example, you have .root.it.user-limit-factor=1, which means a user cannot use more than 100% of the queue's configured capacity; this can limit or negate the optional elasticity given to a queue. Try setting it to 2 and then 3 to see the result.
regards
Created 04-04-2016 10:05 AM
Hi nmaillard,
I tried that already.
yarn.scheduler.capacity.root.it.user-limit-factor=2
yarn.scheduler.capacity.root.price.user-limit-factor=1
In this case, ituser1 picks up 63 containers, but if priceuser1 arrives at this point, ituser1 does not give up the extra containers; he keeps using them for himself. I expected ituser1 to release 31 containers for priceuser1, but that did not happen. I guess this is because ituser1 is considered eligible for 63 containers instead of 32.
Created 04-04-2016 12:32 PM
As you saw, when you increased the user-limit-factor it allocated more containers and you got 200% of the queue's capacity; if you were to set it to 2.5, you would get the queue's full maximum capacity (100% of the cluster). As for the second part: if you want the it queue to release the extra containers to service the price queue, you can either wait for it to happen naturally as the job's containers finish or, better, enable the YARN preemption mechanism: http://hortonworks.com/blog/better-slas-via-resource-preemption-in-yarns-capacityscheduler/
Created 04-04-2016 11:20 AM
You need to set the user-limit-factor to a higher number for the queue 'it'. If you want to let the user get all 77 containers, you should set it to 2.5.
From the documentation at
http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html -
yarn.scheduler.capacity.<queue-path>.user-limit-factor The multiple of the queue capacity which can be configured to allow a single user to acquire more resources. By default this is set to 1 which ensures that a single user can never take more than the queue’s configured capacity irrespective of how idle the cluster is. Value is specified as a float.
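To make the arithmetic behind the 2.5 figure concrete, here is a minimal Python sketch. It assumes a simplified model in which a single user's ceiling is the queue capacity times user-limit-factor, capped by the queue's maximum-capacity. The function name `user_container_cap` is made up for illustration and is not part of YARN, and the scheduler's real rounding to whole containers is not modelled, so the counts actually observed on a cluster can differ by a container or two.

```python
def user_container_cap(total_containers, capacity_pct, user_limit_factor,
                       max_capacity_pct=100.0):
    """Approximate per-user ceiling, in containers, for one queue.

    Simplified model (an assumption, not the scheduler's actual code):
    share = min(capacity * user-limit-factor, maximum-capacity).
    """
    share_pct = min(capacity_pct * user_limit_factor, max_capacity_pct)
    return total_containers * share_pct / 100.0


# Queue 'it' from this thread: capacity=40, maximum-capacity=100, 77 containers.
for factor in (1, 2, 2.5):
    print(factor, round(user_container_cap(77, 40, factor), 1))
```

Under this model a factor of 1 gives roughly 31 containers, 2 gives roughly 62, and 2.5 reaches all 77, which is in line with the 32 and 63 containers reported in this thread.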
Created 04-04-2016 02:30 PM
If I use user-limit-factor=2.5, then why do I need to set yarn.scheduler.capacity.root.it.capacity=40? I could set yarn.scheduler.capacity.root.it.capacity=100 and the result would be the same.
Is yarn.scheduler.capacity.root.it.capacity just a lower limit?
Created 04-08-2016 02:06 PM
Hi @Alena Melnikova, there is a capacity and a maximum capacity. The maximum is what determines elasticity, but setting that alone is not enough, because it applies at the queue level. You also have to adjust the user-limit-factor so that a single user can use more than just the queue's capacity. You are essentially saying that a user can use X times the capacity. For example, if you set user-limit-factor to 2 in this example, the user will be able to use 80 percent (40 x 2) of the cluster. Capacity is the queue's guaranteed share; you cannot set it to 100, because the capacities of the queues under root have to add up to 100. I hope this helps.
Ian
Created 05-03-2016 12:43 PM
Hi @Ian Roberts, thanks for the clarification.
Created 06-11-2017 08:02 AM
Hi Roberts, thanks for your reply, I learned a lot.
I also have a question: if I set yarn.scheduler.capacity.root.it.maximum-capacity=60 and user-limit-factor=2, what is the queue ceiling?
80 (40*2) or 60 (maximum-capacity)?
thanks again
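As far as I understand the CapacityScheduler, maximum-capacity is a hard limit on the queue as a whole, so a user cannot go past it even when capacity times user-limit-factor is larger. The small Python sketch below works through the two candidate answers under that assumption. The helper `user_ceiling_pct` is a hypothetical name used only for illustration; it is not a YARN API.

```python
def user_ceiling_pct(capacity_pct, user_limit_factor, maximum_capacity_pct):
    """Percent of the cluster one user can reach in a queue.

    Assumption: the user limit (capacity * user-limit-factor) is
    additionally capped by the queue's maximum-capacity.
    """
    return min(capacity_pct * user_limit_factor, maximum_capacity_pct)


# The case asked about: capacity=40, user-limit-factor=2, maximum-capacity=60.
print(user_ceiling_pct(40, 2, 60))
# For comparison, the original setting with maximum-capacity=100.
print(user_ceiling_pct(40, 2, 100))
```

Under that assumption the ceiling in the case asked about would be 60, not 80; the factor of 2 only takes full effect when maximum-capacity is at least 80.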