Member since: 01-16-2014
Posts: 336
Kudos Received: 43
Solutions: 31
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 2909 | 12-20-2017 08:26 PM |
 | 2937 | 03-09-2017 03:47 PM |
 | 2496 | 11-18-2016 09:00 AM |
 | 4150 | 05-18-2016 08:29 PM |
 | 3236 | 02-29-2016 01:14 AM |
06-28-2015
11:31 PM
That is correct. To answer the two questions: 1) Yes, you will need to supply the setting every time. There is no built-in rule like "if user == X then queue = Y", but you can write your own rule for it if you want one; rules can be added. 2) The capacity scheduler has a completely different config and does not have placement rules or anything like that, see the CapacityScheduler config documentation. It just has ACLs, nothing else, so you would need to use the same setting every single time; a rough sketch of that config is below. Wilfred
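As a rough sketch of what that ACL-only configuration looks like (the queue names production and test are just examples, and other required settings such as the queue capacities are left out), capacity-scheduler.xml would contain properties like:

  <property>
    <name>yarn.scheduler.capacity.root.queues</name>
    <value>production,test</value>
  </property>
  <property>
    <!-- ACL format: users (comma separated), a space, then groups -->
    <name>yarn.scheduler.capacity.root.production.acl_submit_applications</name>
    <value>hdfs,user1</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.production.acl_administer_queue</name>
    <value>hdfs</value>
  </property>

There is nothing in there that maps a user to a queue, which is why you have to pass the queue in with every job.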
06-23-2015
05:29 PM
You need to start from the top: root first and then down. If you allow anybody to submit to and administer the root queue, there is nothing to enforce on the sub queues. This is what I explained in the linked post and what the explanation in the screenshot shows; a sketch of a locked-down root queue is below. Wilfred
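As a sketch, locking down the top of the hierarchy in the allocation file looks like this (a single space as the value means no users and no groups; the sub queues then carry their own, more open ACLs):

  <queue name="root">
    <aclSubmitApps> </aclSubmitApps>
    <aclAdministerApps> </aclAdministerApps>
    <!-- sub queues with their own submit/admin ACLs go here -->
  </queue>

If root is left open to everybody the sub queue ACLs never restrict anything, because access to a parent queue gives access to the queues below it.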
06-22-2015
07:57 PM
I can't see the screenshots that were added to the message, but setting the ACL for a queue is described here; follow those steps. You must add the "yarn" user to the overall YARN admin ACL, and remove the root queue admin and submit ACLs, otherwise things will not work; see my post in one of the other threads on this forum. Then for the queue ACLs: set the submit ACL for the production queue to hdfs,user1, give the test queue the submit ACL user2, and for the default queue you can set the submit ACL to "*". Put together that looks like the sketch below. Wilfred
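Just a sketch of the queue section of the allocation file (the other queue settings are omitted):

  <queue name="root">
    <!-- root stays closed: a single space means no users and no groups -->
    <aclSubmitApps> </aclSubmitApps>
    <aclAdministerApps> </aclAdministerApps>
    <queue name="production">
      <aclSubmitApps>hdfs,user1</aclSubmitApps>
    </queue>
    <queue name="test">
      <aclSubmitApps>user2</aclSubmitApps>
    </queue>
    <queue name="default">
      <aclSubmitApps>*</aclSubmitApps>
    </queue>
  </queue>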
06-22-2015
07:33 PM
The rules are always executed. If you have a specified rule then that rule will check the value of "mapred.job.queue.name" that you pass in; otherwise the other rules in the list, like the user or group based rules, are checked. The ACL check is performed after a rule provides a queue name. If we look at the user rule for user "test_user" as an example: <rule name="user" create="false" /> This rule builds the queue name and returns "root.test_user" as the queue to put the application in. It will only do that if the queue already exists in the FairScheduler config. If the queue exists it will check whether "test_user" is allowed to submit an application to this queue by checking the ACLs of the "root.test_user" queue and of the "root" queue (i.e. the parent). The ACLs that are checked are both the submit and the admin ACLs. So the generic steps for each rule are:
- build the queue name (rule dependent)
- check whether the queue exists if create is false; stop if this check fails (return no queue name)
- check the submit ACL and then the admin ACL for the queue
- if the ACL check fails, check the parent queues all the way up to root for access (submit and admin ACL on each level); if there is no access, return no queue name
- return the queue name
These steps are repeated for each rule, in the order the rules are configured, until a queue is found or no more rules are left. A small example of such an ordered rule list is below. Does that explain it? Wilfred
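As a sketch (your rules will differ), an ordered rule list in the allocation file looks like this; the rules are tried from top to bottom:

  <queuePlacementPolicy>
    <!-- 1: use the queue the job asked for, but only if it already exists -->
    <rule name="specified" create="false" />
    <!-- 2: try root.<username>, again only if it already exists -->
    <rule name="user" create="false" />
    <!-- 3: otherwise fall back to the default queue -->
    <rule name="default" />
  </queuePlacementPolicy>

Each rule that produces a queue name still has to pass the ACL checks described above before the application is accepted into that queue.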
06-19-2015
03:24 AM
This error:
main : command provided 1
main : user is abc
main : requested yarn user is abc
Container exited with a non-zero exit code 1
looks like the exit code from the Linux container executor. In cluster mode the driver runs inside the same container as the Application Master, which makes a difference. As other people have said already, get the logs from the containers by running: yarn logs -applicationId APPID. Make sure that you run it as the user "abc" (the same user that executes the spark command). Wilfred
06-18-2015
09:39 PM
All create options are false, which means that you can only use the queues that are in the config (user, primary group and specified). The specified rule will look at the job config and trigger if you have set "mapreduce.job.queuename", again only if that queue exists. If none of those rules apply then the default rule will trigger. I think you have not specified the correct queue in your job... For the ACLs: if you have admin rights you also have submit rights, but having submit rights does not give you admin rights. The only extra thing admin rights on a queue give you is the ability to kill any application in that queue. Submit ACLs are enforced, as are the admin ACLs. For both to work you must have a yarn.admin.acl configured and not set to "*", since that will make any user a YARN admin; see the sketch below. Wilfred
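The yarn-site.xml part is just this (the value "yarn" is only an example; list the users and groups that really should administer YARN, but never "*"):

  <property>
    <!-- format: users (comma separated), a space, then groups -->
    <name>yarn.admin.acl</name>
    <value>yarn</value>
  </property>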
06-16-2015
08:03 PM
Specifying worker opts on the client does not really make sense. The worker needs to know what it needs to clean up, so it should be set on the worker. Try adding the whole string (the part you have between the quotes) to the "Additional Worker args" for the worker. Wilfred
06-16-2015
06:14 PM
Using class path precedence is not the correct solution for all cases. A solution that will work in all cases is to shade the classes that you have modified versions of (use Maven or Gradle to do that). In your case you need to shade the Parquet classes that you have modified when you package the jar; a sketch of how to do that with Maven is below. Be careful when you change classes like the Parquet ones: you could end up with files that are only readable with your code, which forces you to keep packaging it with all jobs. That could cause problems later if you decide to use a different method to access the files. Wilfred
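With Maven that is roughly the following in the pom.xml (a sketch only: the relocation pattern must match the package names of the Parquet classes you modified, and "myshaded" is a made up prefix):

  <plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-shade-plugin</artifactId>
    <executions>
      <execution>
        <phase>package</phase>
        <goals>
          <goal>shade</goal>
        </goals>
        <configuration>
          <relocations>
            <!-- move the modified classes into a package of your own -->
            <relocation>
              <pattern>parquet</pattern>
              <shadedPattern>myshaded.parquet</shadedPattern>
            </relocation>
          </relocations>
        </configuration>
      </execution>
    </executions>
  </plugin>

The relocation rewrites the references inside your jar so your modified classes are used instead of the ones that ship with the cluster.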
06-16-2015
06:09 PM
1 Kudo
Specifying an ACL does not mean that a job gets placed into that queue. Your placement policies need to put the job into the queue based on the information you have inside the config. Check the placement policies and how they work in the FairScheduler configuration. Also, your "yarn application -list" output does not correspond to the configuration you have given: the root.userx queue should not exist or run any applications, based on the queue placement policy for the fair scheduler (the rules have create="false"). The capacity scheduler does not have a placement policy and would, I assume, dump the application in the first queue that is allowed. Wilfred
06-09-2015
08:28 PM
Can you check the path separator? I would have expected that on Windows you would use \ and not /. Can you also explain how you start PySpark: do you use the cmd scripts or run it under cygwin? BTW: we do not test Windows as a client, so you might be running into a known issue. Wilfred