YARN - Jobs not running under assigned queues

Explorer

Hello, I implemented the capacity scheduler in my Hadoop environment (3-node cluster) with the queues listed below. I also assigned certain users to queues, but when I run a job as one of the users assigned to a particular queue, the job runs under the default queue rather than the assigned queue. When I run the job, I do not specify the queue on the MR job command line.

But when the job is run with the assigned queue specified using -D mapreduce.job.queuename, it runs under that queue.

 

Tried with fair scheduler as well and found the same behavior. 

 

My understanding is that once these queues are defined and users are allocated to them, jobs run by those users should automatically be assigned to the allocated queues (without changing the jobs to specify the queue name). Please let me know if this is not how it works.

 

[root@xxxxxxxx ~]# cat /etc/gphd/hadoop/conf/fair-scheduler.xml

<allocations>
  <queue name="default">
    <minResources>14417mb,22vcores</minResources>
    <maxResources>1441792mb,132vcores</maxResources>
    <maxRunningApps>50</maxRunningApps>
    <weight>10</weight>
    <schedulingPolicy>fifo</schedulingPolicy>
    <minSharePreemptionTimeout>300</minSharePreemptionTimeout>
  </queue>
  <queue name="framework">
    <minResources>14417mb, 88vcores</minResources>
    <maxResources>1441792mb, 132vcores</maxResources>
    <maxRunningApps>5</maxRunningApps>
    <weight>30</weight>
    <schedulingPolicy>fair</schedulingPolicy>
    <minSharePreemptionTimeout>300</minSharePreemptionTimeout>
    <aclSubmitApps>userx,svc_cpsi_s1,svc_pusd_s1,svc_ssyr_s1,svc_susd_s1</aclSubmitApps>
  </queue>
  <queue name="transformation">
    <minResources>14417mb, 88vcores</minResources>
    <maxResources>1441792mb, 132vcores</maxResources>
    <maxRunningApps>5</maxRunningApps>
    <weight>20</weight>
    <schedulingPolicy>fair</schedulingPolicy>
    <minSharePreemptionTimeout>300</minSharePreemptionTimeout>
    <aclSubmitApps>svc_bdli_s1,svc_bdlt_s1,svc_bdlm_s1</aclSubmitApps>
  </queue>
  <userMaxAppsDefault>50</userMaxAppsDefault>
  <fairSharePreemptionTimeout>6000</fairSharePreemptionTimeout>
  <defaultQueueSchedulingPolicy>fifo</defaultQueueSchedulingPolicy>
  <queuePlacementPolicy>
    <rule name="specified" create="false" />
    <rule name="user" create="false" />
    <rule name="group" create="false" />
    <rule name="default" />
  </queuePlacementPolicy>
</allocations>

 

The job is being run as userx, who per the config file should be part of the framework queue.

[userx@xxxxxxxx tmp]$ hadoop jar /usr/lib/gphd/hadoop-mapreduce/hadoop-mapreduce-examples.jar wordcount /yarn/man*.txt /yarn/testout1 > /tmp/testout1

15/06/01 12:08:43 INFO client.RMProxy: Connecting to ResourceManager at xxxxxxxx.xyz.com/10.15.232.185:8032

15/06/01 12:08:43 INFO input.FileInputFormat: Total input paths to process : 5

15/06/01 12:08:43 INFO mapreduce.JobSubmitter: number of splits:5

15/06/01 12:08:44 INFO impl.YarnClientImpl: Submitted application application_1433174070996_0002 to ResourceManager at xxxxxxxx.xyz.com/10.15.232.185:8032

15/06/01 12:08:44 INFO mapreduce.Job: The url to track the job: http://xxxxxxxx.xyz.com:8088/proxy/application_1433174070996_0002/

15/06/01 12:08:44 INFO mapreduce.Job: Running job: job_1433174070996_0002

 

Instead of the framework queue, the job is being run under the root.userx queue.

[root@xxxxxxxx ~]# yarn application -list

15/06/01 12:09:02 INFO client.RMProxy: Connecting to ResourceManager at xxxxxxxx.xyz.com/10.15.232.185:8032

Total number of applications (application-types: [] and states: [SUBMITTED, ACCEPTED, RUNNING]):1

               Application-Id     Application-Name       Application-Type         User           Queue                   State             Final-State             Progress                           Tracking-URL

application_1433174070996_0002           word count               MAPREDUCE       userx     root.userx               RUNNING               UNDEFINED                   5%   http://xxxxxxxx.xyz.com:39001

Super Collaborator

Specifying an ACL does not mean that a job gets placed into that queue. Your placement policy needs to put the job into the queue based on the information in the config. Check the placement rules and how they work in the FairScheduler configuration documentation.

Also, your "yarn application -list" output does not correspond to the configuration you have given: based on the queue placement policy for the fair scheduler, the root.userx queue should not exist or run any applications (the rules have create="false").

 

The capacity scheduler does not have a placement policy and would, I assume, dump the application in the first queue that is allowed.

 

Wilfred

 

 

Explorer
Thanks Wilfred for the response.



I was trying different combinations, and root.userx must have been created while create was not set to false. I checked the documentation but was unable to figure out what the queue placement policy should be so that jobs submitted by users are assigned to the appropriate queues.


With the queue placement policy below, jobs are always placed into the default queue:


<queuePlacementPolicy>
<rule name="user" create="false" />
<rule name="primaryGroup" create="false" />
<rule name="specified" create="false" />
<rule name="default" />
</queuePlacementPolicy>


The documentation states that "Currently the only supported administrative action is killing an application. Anybody who may administer a queue may also submit applications to it. "

Does this mean that the aclSubmitApps functionality is not enabled/supported, and that the only ACL function is the admin task of killing an application?

Super Collaborator

All create options are false, which means that the rules (user, primaryGroup and specified) can only place jobs into queues that already exist in the config.

The specified rule will look at the job config and if you have "mapreduce.job.queuename" then that rule will trigger. Again only if the queue exists.

 

If none of those rules apply then default will trigger. I think you have not specified the correct queue in your job...

 

For the acls: if you have admin rights you also have submit rights. If you have submit rights then you do not have admin rights.

On top of submit rights, admin rights in a queue only give you the ability to kill any application in that queue. Submit ACLs are enforced, as are the admin ACLs. For both to work you must have yarn.admin.acl configured and not set to "*", since that will make any user a YARN admin.

 

Wilfred

Explorer

Thanks Wilfred. So you mean that every time a user submits a job, they should specify the queue name in the mapred.job.queue.name parameter?

 

hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar wordcount -Dmapred.job.queue.name=alpha.alpha1 /yarn/man*.txt /yarn/testout1

 

When I run the job specifying the queue name, it runs in the mentioned queue.

 

So do we always need to specify the queue name? Won't it assign the queue automatically based on the submit ACL usernames/groups?

Super Collaborator

The rules are always executed. If you have a specified rule then that rule will check the value of "mapred.job.queue.name" you pass in.

Otherwise the other rules in the list, like the user or group based rules, are checked. The ACL check is performed after a rule provides a queue name.

 

If we look at the user rule, taking user "test_user" as an example: <rule name="user" create="false" />

This rule builds the queue name and returns "root.test_user" as the queue in which to place the application. It will only do that if the queue already exists in the FairScheduler config. If the queue exists, it will check whether "test_user" is allowed to submit an application to this queue by checking the ACLs of the "root.test_user" queue and the "root" queue (i.e. the parent). The ACLs that are checked are both the submit and admin ACLs.

 

So the generic steps are for each rule:

- build the queue name (rule dependent)

- if create is false, check that the queue exists; stop if this check fails (return no queue name)

- check the submit ACL and then the admin ACL for the queue

- if ACL check fails check the parent queues all the way to root for access: submit and admin ACL on each level: if no access return no queue name

- return queue name

 

These steps are repeated for each rule in the order the rules are configured until a queue is found or no more rules are available.
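The walk through the rules can be modelled in a few lines of Python. This is a sketch of the steps as listed in this reply, not the FairScheduler implementation: the function names, the merged submit/admin ACL sets, the "*" wildcard handling and the primary group "users" in the usage note are all assumptions made for illustration.

```python
# Simplified model of the placement walk described above; not FairScheduler code.

def qualify(name):
    """Prefix a queue name with "root." unless it is already fully qualified."""
    return name if name == "root" or name.startswith("root.") else "root." + name


def ancestors(queue):
    """Yield the queue and then each parent up to root."""
    parts = queue.split(".")
    for end in range(len(parts), 0, -1):
        yield ".".join(parts[:end])


def has_access(user, queue, acls):
    """Check the merged submit/admin ACL of the queue, then of each parent."""
    for q in ancestors(queue):
        allowed = acls.get(q, set())
        if "*" in allowed or user in allowed:
            return True
    return False


def place(user, primary_group, requested, rules, existing, acls):
    """Walk the rules in order; return the first usable queue or None."""
    for rule, create in rules:
        # 1. Build the queue name (rule dependent).
        if rule == "specified":
            name = requested
        elif rule == "user":
            name = user
        elif rule == "primaryGroup":
            name = primary_group
        elif rule == "default":
            name = "default"
        else:
            name = None  # rule not modelled in this sketch
        if name is None:
            continue
        queue = qualify(name)
        # 2. With create="false" the queue must already exist.
        if not create and queue not in existing:
            continue
        # 3./4. Submit or admin access on the queue or any parent.
        if not has_access(user, queue, acls):
            continue
        # 5. Return the queue name.
        return queue
    return None


# Queues from the thread's config; the ACL sets are simplified assumptions.
EXISTING = {"root", "root.default", "root.framework", "root.transformation"}
ACLS = {
    "root.default": {"*"},
    "root.framework": {"userx"},
    "root.transformation": {"svc_bdli_s1"},
}
RULES = [("user", False), ("primaryGroup", False),
         ("specified", False), ("default", True)]
```

With these example values, place("userx", "users", None, RULES, EXISTING, ACLS) returns "root.default", because neither root.userx nor root.users exists and no queue was specified. Passing "framework" as the requested queue returns "root.framework" instead, which matches the behaviour reported in the thread.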

 

Does that explain it?

Wilfred

Explorer

Thanks Wilfred, yes, that explains it. So what I understand is that aclSubmitApps just checks whether a user is allowed to run in the queue determined by the queue placement policy, and the queue placement policy is based on the rules.

Please help with the points below:
1. Fair scheduler: say I create the queues xyz and default in the config file and want user1's jobs to run under the xyz queue. Does user1 always need to specify the queue name on the command line to get jobs placed under xyz? Or is there a way, in the config file or with a custom rule, to specify that whenever user1 runs any job (without specifying a queue name), it should be placed under the xyz queue?
2. How would I do the same thing with the capacity scheduler?

Super Collaborator

That is correct. Answers to the two questions:

1) Yes, you will need to supply the setting every time. There is nothing like "if user == X the queue = Y". You can write your own rule for it if you want one; rules can be added.

2) The capacity scheduler has a completely different config and does not have placement rules or anything like that; see the CapacityScheduler config documentation. It just has ACLs, nothing else. So you would need to use the same setting every single time.
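For point 1, the logic such a custom rule would need is small. The Python below only illustrates the "if user == X the queue = Y" lookup. It is not a FairScheduler API: the mapping table, the function name and the user1/xyz values are hypothetical, and an actual rule would presumably have to be written in Java against the scheduler's own rule classes.

```python
# Hypothetical "if user == X then queue = Y" rule; only the lookup logic is shown.

USER_QUEUE_MAP = {"user1": "root.xyz"}  # hypothetical user-to-queue table


def mapped_queue(user, existing_queues, mapping=USER_QUEUE_MAP):
    """Return the mapped queue for the user, or None so the next rule can run."""
    queue = mapping.get(user)
    # Behave like create="false": only return queues that already exist.
    if queue is None or queue not in existing_queues:
        return None
    return queue
```

Returning None for unmapped users keeps the rule composable: it could sit first in the placement policy and fall through to the specified, user or default rules for everyone else.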

 

Wilfred

Explorer

Thanks much Wilfred.

In response to (1) reply -
yes you will need to supply the setting every time. There is not something like: "if user == X the queue = Y", you can write your own rule for it if you wanted one. The rules can be added.
-- I am just wondering how rules can be written to assign users to queues; I don't see any section in the rules for specifying users and queues. I mean to ask: how is it possible to add a rule that assigns users to queues the way my requirement needs, i.e. "if user == X the queue = Y"?