Cloudera’s Fair Scheduler vs. Capacity Scheduler, which one is the best option to choose?

Expert Contributor

Cloudera’s Fair Scheduler vs. Capacity Scheduler: which one is the better option to choose? What are the main differences between these two schedulers?

1 ACCEPTED SOLUTION

Master Collaborator

The Fair Scheduler is recommended by Cloudera. Here is some background:

http://blog.cloudera.com/blog/2016/01/untangling-apache-hadoop-yarn-part-3/

5 REPLIES

New Contributor

I am not able to understand the difference between the Fair and Capacity schedulers. From what I have read, I understood that they are identical except that the Capacity Scheduler uses FIFO for the users within a queue. I am not sure what this means, or whether it is the whole truth. It would be really helpful if someone could explain this in plain and simple words.

Master Collaborator

This might clear things up:

Fair - Allocates resources to weighted pools, with fair sharing within each pool (docs).

Capacity - Allocates resources to pools, with FIFO scheduling within each pool (docs).
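
To make the two descriptions above concrete, here is a toy sketch in Python. It is only an illustration of "fair sharing within a pool" versus "FIFO within a pool", not how YARN implements either scheduler. The function names, the application names, and the container counts are all made up for the example, and it ignores memory, vcores, locality, and preemption.

```python
from typing import Dict, List, Tuple

# Each application is (name, containers_requested), listed in submission order.
App = Tuple[str, int]


def split_by_weight(total: int, weights: Dict[str, float]) -> Dict[str, int]:
    """Divide cluster containers among pools in proportion to their weights."""
    weight_sum = sum(weights.values())
    return {pool: int(total * w / weight_sum) for pool, w in weights.items()}


def fifo_allocate(capacity: int, apps: List[App]) -> Dict[str, int]:
    """FIFO within a pool: the earliest application is served in full first."""
    allocation: Dict[str, int] = {}
    remaining = capacity
    for name, demand in apps:
        granted = min(demand, remaining)
        allocation[name] = granted
        remaining -= granted
    return allocation


def fair_allocate(capacity: int, apps: List[App]) -> Dict[str, int]:
    """Fair sharing within a pool: hand out containers one at a time,
    round-robin, to every application that still wants more."""
    wanted = dict(apps)
    allocation = {name: 0 for name, _ in apps}
    remaining = capacity
    while remaining > 0:
        hungry = [name for name, _ in apps if allocation[name] < wanted[name]]
        if not hungry:
            break  # every application is fully satisfied
        for name in hungry:
            if remaining == 0:
                break
            allocation[name] += 1
            remaining -= 1
    return allocation


# Hypothetical pool of 10 containers with three applications.
apps = [("etl", 8), ("report", 6), ("adhoc", 2)]
print(fifo_allocate(10, apps))  # {'etl': 8, 'report': 2, 'adhoc': 0}
print(fair_allocate(10, apps))  # {'etl': 4, 'report': 4, 'adhoc': 2}
print(split_by_weight(100, {"prod": 3.0, "dev": 1.0}))  # {'prod': 75, 'dev': 25}
```

In the FIFO case the small "adhoc" job waits behind the two larger ones, while with fair sharing it gets its two containers right away and the larger jobs split the rest evenly.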

New Contributor

Hi jkestelyn,

Thanks for the links, but they have caused even more confusion! The Apache documentation for the Capacity Scheduler says:

"When there is demand for these resources from queues running below capacity at a future point in time, as tasks scheduled on these resources complete, they will be assigned to applications on queues running below the capacity (pre-emption is not supported)"

Whereas the Cloudera documentation, "Job Scheduling in Apache Hadoop", says:

"The Capacity Scheduler also supports configuring a wait time on each queue after which it is allowed to preempt other queues’ tasks if it is below its fair share"

These two statements contradict each other. So, once again, please clarify the actual behavior and the difference between these two scheduling methods.

Master Collaborator

Hi,

The Cloudera "documentation" you reference here is actually an 8-year-old blog post. I would defer to the more current docs.