YARN - Capacity Scheduler with FIFO policy

New Contributor

I am referring to this link: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.2/bk_yarn_resource_mgt/content/flexible_sched...

In this example, two queues have the same resources available. One uses the FIFO ordering policy, and the other uses the Fair Sharing policy. A user submits three jobs to each queue one right after another, waiting just long enough for each job to start. The first job uses 6x the resource limit in the queue, the second 4x, and the last 2x. In the FIFO queue, the 6x job would start and run to completion, then the 4x job would start and run to completion, and then the 2x job. They would start and finish in the order 6x, 4x, 2x.

When I try the Capacity Scheduler with the default FIFO policy (on Hortonworks Data Platform), I can see two jobs running concurrently on the same queue, which is not how it is supposed to behave according to the description above.


Expert Contributor

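That tells me that the ResourceManager determined that there were enough resources in the queue to run both jobs. In the documentation's example every job asks for more than the queue can provide, so the first job takes everything; the FIFO policy only decides which application is offered resources first, it does not force jobs to run one at a time. Here are a couple of things to keep in mind while using the Capacity Scheduler: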

  • Capacity Guarantees - Queues are allocated a fraction of the capacity of the grid, in the sense that a certain capacity of resources will be at their disposal. All applications submitted to a queue will have access to the capacity allocated to the queue. Administrators can configure soft limits and optional hard limits on the capacity allocated to each queue.
  • Elasticity - Free resources can be allocated to any queue beyond its capacity. When queues running below capacity later demand these resources, they are assigned to applications on those queues as the tasks currently scheduled on them complete (pre-emption is not supported). This keeps resources available to queues in a predictable and elastic manner and prevents artificial silos of resources in the cluster, which helps utilization.
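
The two points above can be illustrated with a rough sketch. It is a toy model, not the scheduler's actual algorithm: the function `share_cluster`, the queue names, and the container counts are all made up for illustration. The guaranteed and maximum percentages stand in for the per-queue `capacity` and `maximum-capacity` settings.

```python
def share_cluster(total, queues):
    """Toy model of capacity guarantees plus elasticity.

    total:  containers in the cluster (hypothetical unit).
    queues: dict name -> (guaranteed_pct, max_pct, demand), where the
            percentages are integers and demand is in containers.
    Returns a dict name -> containers allocated.
    """
    alloc = {}
    # Step 1: every queue is served up to its guaranteed capacity.
    for name, (guaranteed_pct, _max_pct, demand) in queues.items():
        alloc[name] = min(demand, guaranteed_pct * total // 100)

    # Step 2: free resources are lent to queues that still have demand,
    # but never beyond the queue's maximum capacity.
    free = total - sum(alloc.values())
    for name, (_guaranteed_pct, max_pct, demand) in queues.items():
        unmet = demand - alloc[name]
        headroom = max_pct * total // 100 - alloc[name]
        extra = max(0, min(unmet, headroom, free))
        alloc[name] += extra
        free -= extra
    return alloc


# Assumed example: two queues with a 50% guarantee each and no hard cap.
result = share_cluster(100, {"etl": (50, 100, 80), "adhoc": (50, 100, 10)})
print(result)
```

In this assumed example the busy queue grows past its 50% guarantee because the other queue is mostly idle. If the idle queue later needs its share, it gets it back as running tasks finish, since nothing is pre-empted.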

https://hadoop.apache.org/docs/r2.7.1/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html
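
To tie this back to the question, here is a minimal sketch of how a FIFO ordering can still leave two jobs running at once. It assumes a simplified model in which the queue hands out containers to applications strictly in submission order; `fifo_allocate`, the job names, and the numbers are hypothetical and ignore details such as Application Master limits.

```python
def fifo_allocate(queue_capacity, demands):
    """Grant containers to applications in submission order.

    queue_capacity: containers the queue can use right now.
    demands: list of (app_name, containers_requested), oldest first.
    Returns a dict app_name -> containers granted.
    """
    remaining = queue_capacity
    granted = {}
    for app, wanted in demands:
        # The oldest application is served first; later ones only get
        # whatever is left once earlier requests are satisfied.
        give = min(wanted, remaining)
        granted[app] = give
        remaining -= give
    return granted


# Documentation scenario: each job wants more than the whole queue,
# so the first job takes everything and the others wait.
doc_case = fifo_allocate(10, [("job-6x", 60), ("job-4x", 40), ("job-2x", 20)])

# Smaller jobs: the first job is fully satisfied with room to spare,
# so the next jobs in line start running alongside it.
small_case = fifo_allocate(10, [("job-a", 4), ("job-b", 3), ("job-c", 5)])

print(doc_case)
print(small_case)
```

Under these assumptions, seeing two jobs run concurrently in a FIFO queue simply means the first job did not need everything the queue had to offer.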

New Contributor

@bhagan Thank you so much.