
Fair Resource Management

Explorer

Hello, I have configured the Fair Scheduler using the fair-scheduler.xml below on a cluster with 3 data nodes, each with 10 GB of memory.

<?xml version="1.0"?>
<allocations>
  <defaultQueueSchedulingPolicy>fair</defaultQueueSchedulingPolicy>
  <queue name="alpha">
    <weight>70</weight>
    <queue name="alpha1" />
      <weight>90</weight>
    <queue name="alpha2" />
      <weight>10</weight>
  </queue>
  <queue name="beta">
    <weight>30</weight>
    <queue name="beta1" />
    <queue name="beta2" />
  </queue>
  <queuePlacementPolicy>
    <rule name="specified" create="false" />
    <rule name="primaryGroup" create="false" />
    <rule name="default" queue="beta.beta1" />
  </queuePlacementPolicy>
</allocations>

1. I am running 6 MR jobs at the same time in the queues alpha1 (3 jobs) and alpha2 (3 jobs). Each job runs the same wordcount program on the same 10 input files of 500 MB each, so each job needs about 10 containers. I expected most of the resources to go to the jobs in the alpha1 queue, but I see a fair allocation of containers to the jobs in both alpha1 and alpha2. Why would this happen? Is it because the scheduling policy is defined as fair? How can I get resources allocated according to the configured weights?

Total jobs:6
                  JobId      State           StartTime      UserName           Queue      Priority       UsedContainers  RsvdContainers  UsedMem         RsvdMem         NeededMem         AM info
 job_1424751721561_0014    RUNNING       1424752404520          root     root.alpha2        NORMAL                    4               1   12288M           3072M            15360M      xyz.com:8088/proxy/application_1424751721561_0014/
 job_1424751721561_0009    RUNNING       1424752396872          root     root.alpha1        NORMAL                    5               0   15360M              0M            15360M      xyz.com:8088/proxy/application_1424751721561_0009/
 job_1424751721561_0011    RUNNING       1424752400028          root     root.alpha1        NORMAL                    3               1    9216M           3072M            12288M      xyz.com:8088/proxy/application_1424751721561_0011/
 job_1424751721561_0013    RUNNING       1424752403084          root     root.alpha2        NORMAL                    5               0   15360M              0M            15360M      xyz.com:8088/proxy/application_1424751721561_0013/
 job_1424751721561_0010    RUNNING       1424752398426          root     root.alpha1        NORMAL                    4               1   12288M           3072M            15360M      xyz.com:8088/proxy/application_1424751721561_0010/
 job_1424751721561_0012    RUNNING       1424752401631          root     root.alpha2        NORMAL                    3               0    9216M              0M             9216M      xyz.com:8088/proxy/application_1424751721561_0012/
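If the configured weights were in effect, the expected steady-state shares can be worked out roughly as follows (assuming the full cluster of 3 nodes x 10 GB = 30 GB is demanded by both sub-queues, and ignoring other queues' demand):

```latex
% Cluster memory: 3 nodes x 10 GB = 30 GB
% alpha's share of the cluster, by weight 70 vs 30:
\text{alpha} = 30\,\text{GB} \times \tfrac{70}{70+30} = 21\,\text{GB}
% alpha1's share of alpha, by weight 90 vs 10:
\text{alpha1} = 21\,\text{GB} \times \tfrac{90}{90+10} = 18.9\,\text{GB}
% alpha2's share of alpha:
\text{alpha2} = 21\,\text{GB} \times \tfrac{10}{90+10} = 2.1\,\text{GB}
```

The job listing above instead shows roughly equal memory across alpha1 and alpha2 jobs, which is what prompts the question.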

When running 2 jobs in the alpha1 queue, I see each job getting 12 containers:

 15/02/23 23:31:26 INFO client.RMProxy: Connecting to ResourceManager at xyz.com/10.15.232.185:8032
Total jobs:2
                  JobId      State           StartTime      UserName           Queue      Priority       UsedContainers  RsvdContainers  UsedMem         RsvdMem         NeededMem         AM info
 job_1424751721561_0007    RUNNING       1424752237119          root     root.alpha1        NORMAL                   12               3   36864M           9216M            46080M      xyz.com:8088/proxy/application_1424751721561_0007/
 job_1424751721561_0008    RUNNING       1424752238780          root     root.alpha1        NORMAL                   12               0   36864M              0M            36864M      xyz.com:8088/proxy/application_1424751721561_0008/

2. As per my understanding, the above queue placement policy should assign the default queue beta1 when I don't specify a queue and there is no queue matching my Linux user or group name. But when I submit jobs without a queue name, a queue based on my Linux user name is created, even though the primaryGroup rule has create set to false. Am I missing something here? Please advise.

2 REPLIES

Re: Fair Resource Management

Master Guru
(1) I'm uncertain at the moment whether sub-queue weights are honoured, but I'll let others comment on this.

(2) You probably need to disable username queue creation via:

<queuePlacementPolicy>
  <rule name="specified" create="false" />
  <rule name="user" create="false" />
  <rule name="primaryGroup" create="false" />
  <rule name="default" queue="beta.beta1" />
</queuePlacementPolicy>

Re: Fair Resource Management

Cloudera Employee

<queue name="alpha1" />
  <weight>90</weight>

In the above queue config, the weight 90 is not associated with alpha1. For that to happen, you want to change it to:

<queue name="alpha1">
  <weight>90</weight>
</queue>

 

In your case, I suspect alpha1 and alpha2 are both taking the default weight of 1, and the resources are therefore being shared equally between the two queues.
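Putting both replies together, a corrected fair-scheduler.xml might look like the sketch below. The queue names and weights are taken from the original question; the key changes are that each weight is nested inside the queue element it applies to, and per-user queue creation is disabled as suggested in the first reply.

```xml
<?xml version="1.0"?>
<allocations>
  <defaultQueueSchedulingPolicy>fair</defaultQueueSchedulingPolicy>
  <queue name="alpha">
    <weight>70</weight>
    <!-- weights must be nested inside the queue they apply to -->
    <queue name="alpha1">
      <weight>90</weight>
    </queue>
    <queue name="alpha2">
      <weight>10</weight>
    </queue>
  </queue>
  <queue name="beta">
    <weight>30</weight>
    <queue name="beta1" />
    <queue name="beta2" />
  </queue>
  <queuePlacementPolicy>
    <rule name="specified" create="false" />
    <!-- disable per-user queue creation, per the first reply -->
    <rule name="user" create="false" />
    <rule name="primaryGroup" create="false" />
    <rule name="default" queue="beta.beta1" />
  </queuePlacementPolicy>
</allocations>
```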

 

 

Karthik Kambatla
Software Engineer, Cloudera Inc.