Created on 06-15-2015 09:09 PM - edited 09-16-2022 02:31 AM
Hello, I have implemented the capacity scheduler in my Hadoop environment (3-node cluster) with the queues listed below. I also assigned certain users to the queues, but when I run a job as one of the users assigned to a particular queue, the job runs under the default queue rather than the assigned one. When running the job I do not specify the queue on the MR job command line.
However, when the job is run with the assigned queue specified via -D mapreduce.job.queuename, it runs under that queue.
I tried the fair scheduler as well and found the same behavior.
My understanding is that once these queues are defined and users are allocated to them, jobs should automatically be assigned to the allocated queues when those users run them (without needing to change the jobs to specify a queue name). Please let me know if this is not how it works.
[root@xxxxxxxx ~]# cat /etc/gphd/hadoop/conf/fair-scheduler.xml
<allocations>
<queue name="default">
<minResources>14417mb,22vcores</minResources>
<maxResources>1441792mb,132vcores</maxResources>
<maxRunningApps>50</maxRunningApps>
<weight>10</weight>
<schedulingPolicy>fifo</schedulingPolicy>
<minSharePreemptionTimeout>300</minSharePreemptionTimeout>
</queue>
<queue name="framework">
<minResources>14417mb, 88vcores</minResources>
<maxResources>1441792mb, 132vcores</maxResources>
<maxRunningApps>5</maxRunningApps>
<weight>30</weight>
<schedulingPolicy>fair</schedulingPolicy>
<minSharePreemptionTimeout>300</minSharePreemptionTimeout>
<aclSubmitApps>userx,svc_cpsi_s1,svc_pusd_s1,svc_ssyr_s1,svc_susd_s1</aclSubmitApps>
</queue>
<queue name="transformation">
<minResources>14417mb, 88vcores</minResources>
<maxResources>1441792mb, 132vcores</maxResources>
<maxRunningApps>5</maxRunningApps>
<weight>20</weight>
<schedulingPolicy>fair</schedulingPolicy>
<minSharePreemptionTimeout>300</minSharePreemptionTimeout>
<aclSubmitApps>svc_bdli_s1,svc_bdlt_s1,svc_bdlm_s1</aclSubmitApps>
</queue>
<userMaxAppsDefault>50</userMaxAppsDefault>
<fairSharePreemptionTimeout>6000</fairSharePreemptionTimeout>
<defaultQueueSchedulingPolicy>fifo</defaultQueueSchedulingPolicy>
<queuePlacementPolicy>
<rule name="specified" create="false" />
<rule name="user" create="false" />
<rule name="group" create="false" />
<rule name="default" />
</queuePlacementPolicy>
</allocations>
The job is being run as userx, who per the config file should be placed in the framework queue:
[userx@xxxxxxxx tmp]$ hadoop jar /usr/lib/gphd/hadoop-mapreduce/hadoop-mapreduce-examples.jar wordcount /yarn/man*.txt /yarn/testout1 > /tmp/testout1
15/06/01 12:08:43 INFO client.RMProxy: Connecting to ResourceManager at xxxxxxxx.xyz.com/10.15.232.185:8032
15/06/01 12:08:43 INFO input.FileInputFormat: Total input paths to process : 5
15/06/01 12:08:43 INFO mapreduce.JobSubmitter: number of splits:5
15/06/01 12:08:44 INFO impl.YarnClientImpl: Submitted application application_1433174070996_0002 to ResourceManager at xxxxxxxx.xyz.com/10.15.232.185:8032
15/06/01 12:08:44 INFO mapreduce.Job: The url to track the job: http://xxxxxxxx.xyz.com:8088/proxy/application_1433174070996_0002/
15/06/01 12:08:44 INFO mapreduce.Job: Running job: job_1433174070996_0002
Instead of the framework queue, the job is being run under the root.userx queue.
[root@xxxxxxxx ~]# yarn application -list
15/06/01 12:09:02 INFO client.RMProxy: Connecting to ResourceManager at xxxxxxxx.xyz.com/10.15.232.185:8032
Total number of applications (application-types: [] and states: [SUBMITTED, ACCEPTED, RUNNING]):1
Application-Id Application-Name Application-Type User Queue State Final-State Progress Tracking-URL
application_1433174070996_0002 word count MAPREDUCE userx root.userx RUNNING UNDEFINED 5% http://xxxxxxxx.xyz.com:39001
Created 06-16-2015 06:09 PM
Specifying an ACL does not mean that a job gets placed into that queue. Your placement policies need to put the job into the queue based on the information inside the config. Check the placement policies and how they work as per the FairScheduler configuration.
Also, your "yarn application -list" output does not correspond to the configuration you have given: the root.userx queue should not exist or run any applications based on the queue placement policy for the fair scheduler (all policies have create="false").
The capacity scheduler does not have a placement policy and would, I assume, place the application in the first queue that is allowed.
Wilfred
Created 06-18-2015 08:32 AM
Created 06-18-2015 09:39 PM
All create options are false, which means you can only use the queues that are in the config (user, primary group and specified).
The specified rule looks at the job config: if you have set "mapreduce.job.queuename" then that rule will trigger. Again, only if the queue exists.
If none of those rules apply, the default rule triggers. I think you have not specified the correct queue in your job...
For the ACLs: if you have admin rights you also have submit rights, but submit rights do not give you admin rights.
Admin rights on a queue only additionally let you kill any application in that queue. Submit ACLs are enforced, as are the admin ACLs. For either to work you must have a yarn.admin.acl configured and not set to "*", since that makes every user a YARN admin.
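As a sketch of that last point, restricting YARN admin rights in yarn-site.xml might look like this (the user and group names here are purely illustrative; the value format is comma-separated users, then a space, then comma-separated groups):

```xml
<!-- yarn-site.xml: limit who is a YARN admin; "*" would make every user an admin -->
<property>
  <name>yarn.admin.acl</name>
  <!-- illustrative: users yarn,hdfs plus members of group hadoop-admins -->
  <value>yarn,hdfs hadoop-admins</value>
</property>
```

With yarn.admin.acl narrowed like this, the per-queue submit and admin ACLs are actually enforced instead of everyone passing the admin check.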
Wilfred
Created 06-22-2015 08:30 AM
Thanks Wilfred, so you mean every time a user submits a job they should specify the queue name in the mapred.job.queue.name parameter:
hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar wordcount -Dmapred.job.queue.name=alpha.alpha1 /yarn/man*.txt /yarn/testout1
When I run the job specifying the queue name, it runs in the mentioned queue.
So do we always need to specify the queue name? Won't it assign the queue automatically based on the submitAcl users/groups?
Created 06-22-2015 07:33 PM
The rules are always executed. If you have a specified rule, that rule checks the value of "mapred.job.queue.name" you pass in.
Otherwise the other rules in the list, such as the user or group based rules, are checked. The ACL check is performed after a rule provides a queue name.
If we look at the user rule for user "test_user" example: <rule name="user" create="false" />
This rule builds the queue name and returns "root.test_user" as the queue in which to place the application. It only does that if the queue already exists in the FairScheduler config. If the queue exists, it checks whether "test_user" is allowed to submit an application in this queue by checking the ACLs of the "root.test_user" queue and of the "root" queue (i.e. the parent). Both the submit and the admin ACLs are checked.
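To make that concrete, the user rule for "test_user" (a hypothetical user) only succeeds if a queue of the same name is defined in the allocations file, something like:

```xml
<allocations>
  <!-- the user rule returns root.test_user, so a queue named test_user must exist -->
  <queue name="test_user">
    <aclSubmitApps>test_user</aclSubmitApps>
  </queue>
  <queuePlacementPolicy>
    <!-- create="false": the rule returns no queue unless test_user is defined above -->
    <rule name="user" create="false" />
    <rule name="default" />
  </queuePlacementPolicy>
</allocations>
```

If the queue definition were missing, the user rule would return no queue and evaluation would fall through to the default rule.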
So the generic steps are for each rule:
- build the queue name (rule dependent)
- check if exist if create is false, stop if this check fails (return no queue name)
- check the submit ACL and then the admin ACL for the queue
- if ACL check fails check the parent queues all the way to root for access: submit and admin ACL on each level: if no access return no queue name
- return queue name
These steps are repeated for each rule in the order the rules are configured until a queue is found or no more rules are available.
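Walking those steps through the placement policy from the config posted at the top of this thread, the evaluation for userx would go roughly like this (the comments are my reading of it):

```xml
<queuePlacementPolicy>
  <!-- 1. no mapreduce.job.queuename was set on the command line: no queue returned -->
  <rule name="specified" create="false" />
  <!-- 2. builds root.userx; create="false" and no queue named userx is defined: no queue -->
  <rule name="user" create="false" />
  <!-- 3. builds root.<primary group of userx>; same outcome if no matching queue exists -->
  <rule name="group" create="false" />
  <!-- 4. no earlier rule returned a queue, so the application lands in the default queue -->
  <rule name="default" />
</queuePlacementPolicy>
```

Note that none of these built-in rules ever consults aclSubmitApps to pick a queue; the ACL is only checked after a rule has already named one.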
Does that explain it?
Wilfred
Created 06-25-2015 09:34 AM
Thanks Wilfred, yes that explains it. So what I understood is that aclSubmitApps just checks whether a user is allowed to run in the queue determined by the queue placement policy, and the queue placement policy is based on the rules.
Please help with below points -
1. Fair scheduler - Say I create queues xyz and default in the config file, and I want "user1"'s jobs to run under the xyz queue. Does he always need to specify the queue name on the command line to get placed under xyz? Or is there any way, in the config file or via a custom rule, to specify that whenever user1 runs any job (without specifying a queue name) it should be placed under xyz?
2. How would I do the same thing with the capacity scheduler?
Created 06-28-2015 11:31 PM
That is correct. For the two questions:
1) Yes, you will need to supply the setting every time. There is no built-in rule like "if user == X then queue = Y"; you could write your own rule for it if you wanted one, since rules can be added.
2) The capacity scheduler has a completely different config and does not have placement rules or anything like that; see the CapacityScheduler configuration documentation. It just has ACLs, nothing else. So you would need to use the same setting every single time.
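Two sketches, with hypothetical user and queue names. For the fair scheduler, one workaround that avoids both a custom rule and the command-line flag is to name the queue after the user, so the built-in user rule (which returns root.<username>) finds an existing queue:

```xml
<allocations>
  <!-- queue named after user1: the user rule maps user1's unspecified jobs here -->
  <queue name="user1">
    <aclSubmitApps>user1</aclSubmitApps>
  </queue>
  <queuePlacementPolicy>
    <rule name="specified" create="false" />
    <rule name="user" create="false" />
    <rule name="default" />
  </queuePlacementPolicy>
</allocations>
```

For the capacity scheduler, newer Hadoop releases (2.6 and later, so this may not apply to the version in this thread) added user/group-to-queue mappings in capacity-scheduler.xml:

```xml
<!-- capacity-scheduler.xml: mapping syntax is u:<user>:<queue> or g:<group>:<queue> -->
<property>
  <name>yarn.scheduler.capacity.queue-mappings</name>
  <value>u:user1:xyz</value>
</property>
<property>
  <!-- if true, the mapping wins even when the user specifies a queue explicitly -->
  <name>yarn.scheduler.capacity.queue-mappings-override.enable</name>
  <value>false</value>
</property>
```

The first sketch only works if the target queue can be named after the user; mapping user1 to an arbitrarily named queue like xyz in the fair scheduler still needs a custom rule on these versions.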
Wilfred
Created 07-02-2015 07:50 AM
Thanks much Wilfred.
In response to (1) reply -
Yes, you will need to supply the setting every time. There is no built-in rule like "if user == X then queue = Y"; you could write your own rule for it if you wanted one, since rules can be added.
-- I am just wondering how rules can be written to assign users to queues; I don't see any users-and-queues specification section in the rules. What I mean to ask is: how is it possible to add a rule that assigns users to queues the way my requirement needs, i.e. "if user == X then queue = Y"?