Created on 12-14-2016 03:22 PM - edited 08-18-2019 05:43 AM
Hi all,
I'm operating a 16-node Hortonworks (Teradata Appliance) cluster at a mid-size telco, and have been for a few months now. We just completed an upgrade from Ambari 2.4 to 2.5 and updated the whole Hadoop stack as well. The cluster is in secure mode using Kerberos and Ranger, and has a YARN Capacity Scheduler with the following configuration:
yarn.scheduler.capacity.root.queues=prd_oper, prd_analyst, prd_am, dev_oper, dev_devs, tst_devs
yarn.scheduler.capacity.root.capacity=100
yarn.scheduler.capacity.root.acl_administer_queue=*
yarn.scheduler.capacity.root.accessible-node-labels=*
yarn.scheduler.capacity.node-locality-delay=40
yarn.scheduler.capacity.maximum-applications=10000
yarn.scheduler.capacity.maximum-am-resource-percent=0.2
yarn.scheduler.capacity.default.minimum-user-limit-percent=100
yarn.scheduler.capacity.root.tst_devs.user-limit-factor=1
yarn.scheduler.capacity.queue-mappings=u:user1:dev_devs
yarn.scheduler.capacity.root.dev_devs.acl_administer_jobs= *
yarn.scheduler.capacity.root.dev_devs.acl_administer_queue= *
yarn.scheduler.capacity.root.dev_devs.acl_submit_applications= *
yarn.scheduler.capacity.root.dev_devs.capacity=1
yarn.scheduler.capacity.root.dev_devs.maximum-capacity=90
yarn.scheduler.capacity.root.dev_devs.state=RUNNING
yarn.scheduler.capacity.root.dev_devs.user-limit-factor=1
yarn.scheduler.capacity.root.dev_oper.acl_administer_jobs= *
yarn.scheduler.capacity.root.dev_oper.acl_administer_queue= *
yarn.scheduler.capacity.root.dev_oper.acl_submit_applications= *
yarn.scheduler.capacity.root.dev_oper.capacity=1
yarn.scheduler.capacity.root.dev_oper.maximum-capacity=90
yarn.scheduler.capacity.root.dev_oper.state=RUNNING
yarn.scheduler.capacity.root.dev_oper.user-limit-factor=1
yarn.scheduler.capacity.root.prd_am.acl_administer_jobs= *
yarn.scheduler.capacity.root.prd_am.acl_administer_queue= *
yarn.scheduler.capacity.root.prd_am.acl_submit_applications= *
yarn.scheduler.capacity.root.prd_am.capacity=1
yarn.scheduler.capacity.root.prd_am.maximum-capacity=90
yarn.scheduler.capacity.root.prd_am.state=RUNNING
yarn.scheduler.capacity.root.prd_am.user-limit-factor=1
yarn.scheduler.capacity.root.prd_analyst.acl_administer_jobs= *
yarn.scheduler.capacity.root.prd_analyst.acl_administer_queue= *
yarn.scheduler.capacity.root.prd_analyst.acl_submit_applications= *
yarn.scheduler.capacity.root.prd_analyst.capacity=10
yarn.scheduler.capacity.root.prd_analyst.maximum-capacity=90
yarn.scheduler.capacity.root.prd_analyst.state=RUNNING
yarn.scheduler.capacity.root.prd_analyst.user-limit-factor=1
yarn.scheduler.capacity.root.prd_oper.acl_administer_jobs= *
yarn.scheduler.capacity.root.prd_oper.acl_administer_queue= *
yarn.scheduler.capacity.root.prd_oper.acl_submit_applications= *
yarn.scheduler.capacity.root.prd_oper.capacity=80
yarn.scheduler.capacity.root.prd_oper.maximum-capacity=90
yarn.scheduler.capacity.root.prd_oper.state=RUNNING
yarn.scheduler.capacity.root.prd_oper.user-limit-factor=1
yarn.scheduler.capacity.root.tst_devs.acl_administer_jobs= *
yarn.scheduler.capacity.root.tst_devs.acl_administer_queue= *
yarn.scheduler.capacity.root.tst_devs.acl_submit_applications= *
yarn.scheduler.capacity.root.tst_devs.capacity=7
yarn.scheduler.capacity.root.tst_devs.maximum-capacity=90
yarn.scheduler.capacity.root.tst_devs.state=RUNNING
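For completeness, the queues can be verified from any client node; this is just the command, the capacities and states it reports should match the values above:

# List the Capacity Scheduler queues with their configured/current capacity and state
mapred queue -list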
With the upgrade of Ambari, a new setting is now available (or at least enforced) in the MapReduce2 configuration: mapreduce.job.queuename.
It is now set to prd_oper, which is a valid queue as defined in the settings above, and any MapReduce job ends up in that queue.
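The behaviour is easy to reproduce with a plain MapReduce example job; no queue is specified, yet it lands in prd_oper (the jar path below assumes a standard HDP layout):

# Sample MR job with no queue specified - it ends up in prd_oper because of the new mapred-site.xml default
hadoop jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples.jar pi 4 1000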
PROBLEM:
All users always end up in the prd_oper queue, as defined by the property above. Even if you try to override it with a setting like --hiveconf mapred.job.queuename=prd_am, the job still goes to prd_oper, i.e. the queue defined in that setting.
This used to work fine before the upgrade, when this option was not defined: I could control the queue mapping of each user/group within the Capacity Scheduler settings and submit map/reduce jobs to any queue I needed. I can't remove this property via Ambari as it is mandatory, nor can I change it directly in mapred-site.xml, because it gets overwritten by Ambari.
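It is easy to confirm on any node that Ambari has re-written the file after a config push; the path below is the usual HDP client config location:

# Show the queue name Ambari keeps writing back into the client config
grep -A 1 mapreduce.job.queuename /etc/hadoop/conf/mapred-site.xml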
In contrast, Spark allows submitting to any queue:
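For example, something along these lines runs in whatever queue is requested (the class and jar names here are only placeholders):

# Spark honours the requested YARN queue via --queue
spark-submit --master yarn --queue prd_am --class com.example.MyApp my-app.jar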
I need to restore the queue mapping to what it used to be before the upgrade. Any help will be appreciated!
Created 12-14-2016 05:44 PM
As you can see, adding this config in Ambari adds it to mapred-site.xml as a default value. Since it is not marked final, a value set via hiveconf will take precedence. I think this is a case of mismatched configs: you are using mapred.job.queuename and not mapreduce.job.queuename. Try changing it to mapreduce.job.queuename and the job should go to the right queue.
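For illustration, a property only blocks client-side overrides when it is marked final in the site file; a mapred-site.xml fragment would look roughly like this (the value shown is just an example):

<property>
  <name>mapreduce.job.queuename</name>
  <value>prd_oper</value>
  <!-- only with final=true would a client-side override be ignored -->
  <final>true</final>
</property>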
Created on 12-15-2016 08:51 AM - edited 08-18-2019 05:43 AM
Thanks for the response,
I just ran this:
beeline -u "jdbc:hive2://local:10001/default;transportMode=http;httpPath=cliservice;principal=hive/_HOST@local.COM" -e "SELECT count(*) FROM log;" --hiveconf mapreduce.job.queuename=root.prd_am
It again went to prd_oper.
That does not seem to be it. Somehow the mapreduce setting is still overriding the hive setting...
Created 12-27-2016 10:56 AM
One workaround that I just tested is to run beeline with the following queue parameter:
beeline -u "jdbc:hive2://local:10001/default;transportMode=http;httpPath=cliservice;principal=hive/_HOST@local.COM" -e "SELECT count(*) FROM log;" --hiveconf tez.queue.name=prd_am
This requests that the query be executed in the prd_am queue. If the user is allowed access to that queue in Ranger, it works fine.
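A quick way to double-check which queue the workaround query actually landed in (the output will obviously vary per cluster):

# List applications with their queue - the Tez session for the query should show up under prd_am
yarn application -list -appStates ALL | grep prd_am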
I am still looking for a solution that uses the default mappings defined in the YARN Capacity Scheduler configuration, like:
yarn.scheduler.capacity.queue-mappings=u:user1:dev_devs, g:devs:dev_devs
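For reference, this is the general shape of the mappings I would like to keep relying on; any group names beyond devs are only examples:

# u:<user>:<queue> maps a single user, g:<group>:<queue> maps a whole group; rules are evaluated in order, first match wins
yarn.scheduler.capacity.queue-mappings=u:user1:dev_devs,g:devs:dev_devs,g:analysts:prd_analyst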