How does over provisioning split in fair scheduler?

New Contributor

Hi All,

 

I have a question regarding YARN's dynamic resource pool configuration.

 

Let's say I have 3 pools:

- production   (weight=5)

- development   (weight=1)

- default   (weight=1)

 

Also, let's assume no one is using the "default" resource pool, and there are active jobs running on the two other resource pools.

 

I understand that in that case the default pool's resources will be used by the other users.

But how do those resources get split?

Is it going to any user based on the fair scheduler?

Or is it being split 5:1 (production:development, respectively) between the active jobs in the other resource pools?

 

Thanks,

Eyal Yurman.

 

1 REPLY

Re: How does over provisioning split in fair scheduler?

Contributor

Hi Eyal,

 

Regarding your questions:

> But how do those resources get split?

Assuming no more granular limits are being hit (e.g. the maximum number of jobs per pool/queue/user), the resources are split by summing the weights of all queues/pools that have running jobs and giving each pool a share proportional to its weight. In your example, the two queues with running jobs are development and production, whose weights sum to 6. Dividing 100% by 6, each unit of weight is worth about 16.67%. So the development pool would get about 16.67% of the cluster resources, and the production pool would get about 83.33%. If the default pool receives a job submission, the shares are recalculated with a total weight of 7 instead of 6. Depending on how you have the other Fair Scheduler properties set up, various fairness actions would then be taken (e.g. preemption, preemption timeouts).
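To make the arithmetic concrete, here is a minimal Python sketch of the weight-proportional split described above. It is a simplified model, not the Fair Scheduler's actual code: the function name `fair_shares` and the idea of passing in a set of "active" pools are assumptions made for illustration, and it ignores demand, min/max resource settings, and preemption.

```python
from fractions import Fraction


def fair_shares(weights, active):
    """Split the cluster among active pools in proportion to their weights.

    weights: dict of pool name -> integer weight (hypothetical input format)
    active:  set of pool names that currently have running jobs
    Returns a dict of pool name -> Fraction of the cluster.
    """
    # Only pools with running jobs take part in the calculation.
    active_weights = {name: w for name, w in weights.items() if name in active}
    total = sum(active_weights.values())
    if total == 0:
        return {}
    return {name: Fraction(w, total) for name, w in active_weights.items()}


# Weights from the question.
weights = {"production": 5, "development": 1, "default": 1}

# Case 1: nobody is using "default" (total weight 6).
two_active = fair_shares(weights, {"production", "development"})

# Case 2: a job lands in "default" (total weight 7).
three_active = fair_shares(weights, {"production", "development", "default"})

for label, shares in (("default idle", two_active), ("default active", three_active)):
    print(label)
    for name, share in sorted(shares.items()):
        print(f"  {name}: {float(share):.2%}")
```

With "default" idle, the sketch gives production 5/6 and development 1/6 of the cluster; once "default" becomes active, the split is 5/7, 1/7 and 1/7.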

 

> Is it going to any user based on the fair scheduler?

No. As mentioned above, if no jobs are running in the default pool, it is not included in the share calculation. The default pool is counted only when a job is submitted to or running in it.

 

> Or is it being split 5:1 (production:development, respectively) between the active jobs in the other resource pools?

Correct :)

 

 

Hope this helps!

 

-- Anthony

 
