I am having some trouble setting the following scheduler queue parameters:
I have 2 queues, Dev and Prod.
(If only one queue is in use, it should act as 100% of the cluster.)
Each queue is used by multiple users and resources should be shared equally between them, but when only one user is active in a queue, that user should get the entire capacity of the queue. Likewise, if a user is alone in the cluster, they should get 100% of the cluster; when a second user joins, the scheduler should share the available resources.
I want the users to share the capacity of the cluster; each should receive 50%.
The ordering policy is set to fair, but when one user takes all the resources and another user submits a job, the second job will not start until the first job finishes.
For some reason, Minimum User Limit has no effect.
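For context, the settings I am experimenting with look roughly like this (queue names and values here are illustrative, not my exact config):

```properties
# capacity-scheduler.xml, expressed as properties
yarn.scheduler.capacity.root.queues=Dev,Prod
# Each queue gets a fixed share but may grow to 100% when the other is idle
yarn.scheduler.capacity.root.Dev.capacity=50
yarn.scheduler.capacity.root.Dev.maximum-capacity=100
# Intent: with two active users in the queue, each is guaranteed at least 50% of it
yarn.scheduler.capacity.root.Dev.minimum-user-limit-percent=50
yarn.scheduler.capacity.root.Dev.ordering-policy=fair
```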
The default scheduler in HDP is the Capacity Scheduler. You should note the differences between the three scheduler types:
Capacity Scheduler: designed to run Hadoop applications as a shared, multi-tenant cluster in an operator-friendly manner while maximizing the throughput and utilization of the cluster.
FIFO Scheduler: the simplest scheduling algorithm; it simply queues applications in the order they arrive.
Fair Scheduler: assigns resources to applications such that all apps get, on average, an equal share of resources over time.
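Note that these are not mutually exclusive choices at the leaf level: the Capacity Scheduler lets you pick an ordering policy (fifo or fair) per queue. A minimal sketch (the queue name `dev` is just an example):

```properties
# In capacity-scheduler.xml: apps within this queue share resources fairly
yarn.scheduler.capacity.root.dev.ordering-policy=fair
# With fifo instead, apps in the queue would run in submission order:
# yarn.scheduler.capacity.root.dev.ordering-policy=fifo
```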
Having said that, in your example above Prod has 70%, so despite "Each queue is used by multiple users and resources should be shared equally", jobs in the Prod queue will have priority over the Dev queue, which I think is the desired config.
Can you share your Capacity Scheduler values?
yarn.resourcemanager.scheduler.class = org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
Currently the Capacity Scheduler values differ a bit from what I described, because of an integration.
There are currently 3 queues: default at 50%, and bt and opt at 25% each.
I have done the testing on the default queue.
yarn.scheduler.capacity.maximum-am-resource-percent=0.2
yarn.scheduler.capacity.maximum-applications=10000
yarn.scheduler.capacity.node-locality-delay=40
yarn.scheduler.capacity.root.accessible-node-labels=*
yarn.scheduler.capacity.root.acl_administer_queue=yarn
yarn.scheduler.capacity.root.capacity=100
yarn.scheduler.capacity.root.default.acl_submit_applications=yarn
yarn.scheduler.capacity.root.default.capacity=50
yarn.scheduler.capacity.root.default.maximum-capacity=100
yarn.scheduler.capacity.root.default.state=RUNNING
yarn.scheduler.capacity.root.default.user-limit-factor=2
yarn.scheduler.capacity.root.queues=bt,default,opt
yarn.scheduler.capacity.queue-mappings-override.enable=false
yarn.scheduler.capacity.root.bt.acl_administer_queue=*
yarn.scheduler.capacity.root.bt.acl_submit_applications=*
yarn.scheduler.capacity.root.bt.capacity=25
yarn.scheduler.capacity.root.bt.maximum-capacity=100
yarn.scheduler.capacity.root.bt.minimum-user-limit-percent=100
yarn.scheduler.capacity.root.bt.ordering-policy=fair
yarn.scheduler.capacity.root.bt.ordering-policy.fair.enable-size-based-weight=false
yarn.scheduler.capacity.root.bt.priority=0
yarn.scheduler.capacity.root.bt.state=RUNNING
yarn.scheduler.capacity.root.bt.user-limit-factor=1
yarn.scheduler.capacity.root.default.acl_administer_queue=yarn
yarn.scheduler.capacity.root.default.minimum-user-limit-percent=25
yarn.scheduler.capacity.root.default.ordering-policy=fair
yarn.scheduler.capacity.root.default.ordering-policy.fair.enable-size-based-weight=false
yarn.scheduler.capacity.root.default.priority=0
yarn.scheduler.capacity.root.opt.acl_administer_queue=*
yarn.scheduler.capacity.root.opt.acl_submit_applications=*
yarn.scheduler.capacity.root.opt.capacity=25
yarn.scheduler.capacity.root.opt.maximum-capacity=25
yarn.scheduler.capacity.root.opt.minimum-user-limit-percent=100
yarn.scheduler.capacity.root.opt.ordering-policy=fair
yarn.scheduler.capacity.root.opt.ordering-policy.fair.enable-size-based-weight=false
yarn.scheduler.capacity.root.opt.priority=0
yarn.scheduler.capacity.root.opt.state=RUNNING
yarn.scheduler.capacity.root.opt.user-limit-factor=1
yarn.scheduler.capacity.root.priority=0
I made sub-queues for each user and it works, but you have to manage it per user (we are a small company with few users, so it is manageable).
I configured this on our grid, but I am still looking for a better approach.
In your case I would suggest the following configuration:
Dev queue: Capacity 30%, Max Capacity 70%, User Limit Factor: 4, Ordering policy: Fair
Prod queue: Capacity 70%, Max Capacity 100%, User Limit Factor: 2, Ordering policy: Fair
Make sure preemption is enabled in the YARN configuration.
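Cross-queue preemption is switched on cluster-wide rather than per queue; a sketch of the relevant yarn-site.xml properties (verify the values against your HDP version):

```xml
<!-- yarn-site.xml: enable the scheduler preemption monitor -->
<property>
  <name>yarn.resourcemanager.scheduler.monitor.enable</name>
  <value>true</value>
</property>
<!-- Policy that preempts containers to restore configured queue capacities -->
<property>
  <name>yarn.resourcemanager.scheduler.monitor.policies</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.ProportionalCapacityPreemptionPolicy</value>
</property>
```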
These configs should give you the desired result, letting each queue grow beyond its configured share when the other queue is idle.
The trick is the "User Limit Factor", which allows a single user in the Dev queue to "steal" resources from the Prod queue, up to 4 times the queue's configured capacity (capped by the queue's maximum capacity) while Prod is idle.
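Written out as capacity-scheduler properties, the suggested setup looks roughly like this (a sketch; the queue names `dev` and `prod` are assumptions, and since a queue's maximum capacity cannot be below its configured capacity, prod's is set to 100% here):

```properties
yarn.scheduler.capacity.root.queues=dev,prod
yarn.scheduler.capacity.root.dev.capacity=30
yarn.scheduler.capacity.root.dev.maximum-capacity=70
# A single user may take up to 4 x 30% = 120% of dev's capacity,
# capped by the queue's 70% maximum capacity
yarn.scheduler.capacity.root.dev.user-limit-factor=4
yarn.scheduler.capacity.root.dev.ordering-policy=fair
yarn.scheduler.capacity.root.prod.capacity=70
yarn.scheduler.capacity.root.prod.maximum-capacity=100
# A single user may take up to 2 x 70% = 140%, capped at 100%
yarn.scheduler.capacity.root.prod.user-limit-factor=2
yarn.scheduler.capacity.root.prod.ordering-policy=fair
```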
Thanks, but I have already done that.