New Contributor
Posts: 2
Registered: ‎02-18-2016
Accepted Solution

Cloudera’s Fair Scheduler vs. Capacity Scheduler, which one is the best option to choose?

Cloudera’s Fair Scheduler vs. Capacity Scheduler: which one is the better option? What are the main differences between these two schedulers?

Posts: 354
Topics: 162
Kudos: 60
Solutions: 27
Registered: ‎06-26-2013

Re: Cloudera’s Fair Scheduler vs. Capacity Scheduler, which one is the best option to choose?

The Fair Scheduler is recommended by Cloudera. Here is some background:

http://blog.cloudera.com/blog/2016/01/untangling-apache-hadoop-yarn-part-3/

New Contributor
Posts: 2
Registered: ‎05-04-2016

I am not able to understand the difference between the Fair and Capacity schedulers. From what I have read, they are identical except that the Capacity Scheduler uses FIFO for the users within a queue. I am not sure what this means or whether it is the whole truth. It would be really helpful if someone could explain this in plain and simple words.

Posts: 354
Topics: 162
Kudos: 60
Solutions: 27
Registered: ‎06-26-2013

This might clear things up:

Fair - Allocates resources to weighted pools, with fair sharing within each pool (docs).

Capacity - Allocates resources to pools, with FIFO scheduling within each pool (docs).
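To make the "fair sharing vs. FIFO within a pool" distinction concrete, here is a minimal toy sketch. It is not how YARN implements either scheduler; the function names, the single "pool" of 10 resource units, and the application demands are all hypothetical, chosen only to illustrate the two policies described above.

```python
from typing import Dict, List, Tuple


def fifo_allocate(capacity: int, demands: List[Tuple[str, int]]) -> Dict[str, int]:
    """Toy FIFO policy: serve applications strictly in submission order."""
    allocation: Dict[str, int] = {}
    remaining = capacity
    for app, demand in demands:
        granted = min(demand, remaining)  # earlier apps take what they need first
        allocation[app] = granted
        remaining -= granted
    return allocation


def fair_allocate(capacity: int, demands: List[Tuple[str, int]]) -> Dict[str, int]:
    """Toy fair-share policy: hand out one unit at a time, round-robin,
    skipping applications whose demand is already satisfied."""
    allocation = {app: 0 for app, _ in demands}
    wanted = dict(demands)
    remaining = capacity
    while remaining > 0:
        progressed = False
        for app in allocation:
            if remaining > 0 and allocation[app] < wanted[app]:
                allocation[app] += 1
                remaining -= 1
                progressed = True
        if not progressed:  # every demand is met; leave the rest unused
            break
    return allocation


# Hypothetical pool with 10 units and three apps submitted in this order.
apps = [("app_a", 8), ("app_b", 4), ("app_c", 4)]
print(fifo_allocate(10, apps))  # {'app_a': 8, 'app_b': 2, 'app_c': 0}
print(fair_allocate(10, apps))  # {'app_a': 4, 'app_b': 3, 'app_c': 3}
```

In this toy model, FIFO lets the first application take most of the pool while the last one waits, whereas fair sharing gives every running application a roughly equal slice.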

New Contributor
Posts: 2
Registered: ‎05-04-2016

Hi jkestelyn,

Thanks for the links, but they have caused even more confusion! The Apache documentation for the Capacity Scheduler says:

When there is demand for these resources from queues running below capacity at a future point in time, as tasks scheduled on these resources complete, they will be assigned to applications on queues running below the capacity (pre-emption is not supported)

Whereas the Cloudera documentation, "Job Scheduling in Apache Hadoop", says:

The Capacity Scheduler also supports configuring a wait time on each queue after which it is allowed to preempt other queues’ tasks if it is below its fair share

These two statements contradict each other! So, once again, please clarify the actual behavior and the difference between these two scheduling methods.

Posts: 354
Topics: 162
Kudos: 60
Solutions: 27
Registered: ‎06-26-2013

Hi,

The Cloudera "documentation" you reference here is actually an 8-year-old blog post. I would defer to the more current docs.