
Yarn Queue always is root.username when submit hive action with oozie when CDH5.7


Explorer

Hi,

     We are running CDH 5.7.1 with the Fair Scheduler. There is an issue: when we submit a Hive action through Oozie, the Hive MR job runs in the queue root.submit**username. We do have a queue root.default that is configured for user hive and group hive, and I have set mapred.job.queue.name=root.default. Even so, the Hive MR job always runs in root.submit**username.

     We do not want to create more per-user queues.

     How can we resolve this issue?

     Thank you for your suggestions!!

BR

Paul

 

8 REPLIES

Re: Yarn Queue always is root.username when submit hive action with oozie when CDH5.7

Rising Star

What is your queuePlacementPolicy? You may have other policies that take precedence over the one you intended.

Re: Yarn Queue always is root.username when submit hive action with oozie when CDH5.7

Explorer

Hi haibo,

Here is our configuration from CM:

 

{"defaultFairSharePreemptionThreshold":null,"defaultFairSharePreemptionTimeout":null,"defaultMinSharePreemptionTimeout":null,"defaultQueueSchedulingPolicy":"drf","queueMaxAMShareDefault":null,"queueMaxAppsDefault":null,"queuePlacementRules":[{"create":false,"name":"specified"},{"create":false,"name":"reject"}],"queues":[{"aclAdministerApps":"*","aclSubmitApps":" ","allowPreemptionFrom":null,"fairSharePreemptionThreshold":null,"fairSharePreemptionTimeout":null,"minSharePreemptionTimeout":null,"name":"root","queues":[{"aclAdministerApps":"*","aclSubmitApps":"hive hive","allowPreemptionFrom":false,"fairSharePreemptionThreshold":null,"fairSharePreemptionTimeout":null,"minSharePreemptionTimeout":null,"name":"default","queues":[],"schedulablePropertiesList":[{"impalaDefaultQueryMemLimit":null,"impalaMaxMemory":null,"impalaMaxQueuedQueries":null,"impalaMaxRunningQueries":null,"impalaQueueTimeout":null,"maxAMShare":null,"maxResources":null,"maxRunningApps":null,"minResources":null,"scheduleName":"default","weight":4.0}],"schedulingPolicy":"drf"},{"aclAdministerApps":"*","aclSubmitApps":"dejun,yunping,liang arch","allowPreemptionFrom":false,"fairSharePreemptionThreshold":null,"fairSharePreemptionTimeout":null,"minSharePreemptionTimeout":null,"name":"plarch","queues":[],"schedulablePropertiesList":[{"impalaDefaultQueryMemLimit":null,"impalaMaxMemory":null,"impalaMaxQueuedQueries":null,"impalaMaxRunningQueries":null,"impalaQueueTimeout":null,"maxAMShare":null,"maxResources":null,"maxRunningApps":null,"minResources":null,"scheduleName":"default","weight":3.0}],"schedulingPolicy":"drf"}],"schedulablePropertiesList":[{"impalaDefaultQueryMemLimit":null,"impalaMaxMemory":null,"impalaMaxQueuedQueries":null,"impalaMaxRunningQueries":null,"impalaQueueTimeout":null,"maxAMShare":null,"maxResources":null,"maxRunningApps":null,"minResources":null,"scheduleName":"default","weight":1.0}],"schedulingPolicy":"drf"}],"userMaxAppsDefault":null,"users":[]}
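For readers who find the single-line dump hard to scan, here is a minimal Python sketch that pulls out the two parts relevant to this thread: the placement rules and the queue tree with its submit ACLs. The embedded JSON is a trimmed excerpt of the configuration above (only the fields used here are kept), and the helper name queue_paths is made up for this illustration.

```python
import json

# Trimmed excerpt of the configuration pasted above; all other fields omitted.
config_text = """
{"queuePlacementRules": [{"create": false, "name": "specified"},
                         {"create": false, "name": "reject"}],
 "queues": [{"name": "root", "aclSubmitApps": " ",
             "queues": [{"name": "default", "aclSubmitApps": "hive hive",
                         "queues": []},
                        {"name": "plarch",
                         "aclSubmitApps": "dejun,yunping,liang arch",
                         "queues": []}]}]}
"""


def queue_paths(queues, prefix=""):
    """Flatten the nested queue list into (full path, submit ACL) pairs."""
    paths = []
    for queue in queues:
        path = prefix + queue["name"]
        paths.append((path, queue["aclSubmitApps"]))
        paths.extend(queue_paths(queue["queues"], path + "."))
    return paths


config = json.loads(config_text)
rules = [(rule["name"], rule["create"]) for rule in config["queuePlacementRules"]]
paths = queue_paths(config["queues"])

print("placement rules:", rules)
for path, acl in paths:
    print(path, "-> aclSubmitApps:", repr(acl))
```

Read this way, the configuration has only two placement rules, "specified" followed by "reject", and two queues under root: root.default and root.plarch.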

 

So, what is the problem? Could you please help me check it?

Thanks in advance.
BR
Paul

 

Re: Yarn Queue always is root.username when submit hive action with oozie when CDH5.7

Explorer

Hi, haibo

 

Could you please help me check the config?

 

BR

Paul

Re: Yarn Queue always is root.username when submit hive action with oozie when CDH5.7

Rising Star

Dear Paul,

 

Here is some documentation on the queue placement policy:

 

http://blog.cloudera.com/blog/2016/06/untangling-apache-hadoop-yarn-part-4-fair-scheduler-queue-basi...

 

Best regards,

 

      Gabor

Re: Yarn Queue always is root.username when submit hive action with oozie when CDH5.7

Explorer

Hi Gabor,

 

Thanks for your quick response. I will read the blog later.

 

I would also like to point out a workaround: put set mapred.job.queue.name=root.xxx in the beeline console or in the HQL script.

Then, when you run beeline or the hive2 action in Oozie, the job goes to the correct queue.

 

Thanks

BR

Paul

Re: Yarn Queue always is root.username when submit hive action with oozie when CDH5.7

Explorer

I have the same issue: Hive jobs submitted via HiveServer2 always run in the queue root.user.hive.

I tried setting the mapred.job.queue.name and mapreduce.job.queuename parameters, but it did not help.

I guess this is because we disabled impersonation in HiveServer2 for Sentry, so YARN always runs the job as user "hive".

Does anyone have a workaround for this? I would like the queue to follow the actual user's group resource pool instead of root.user.hive.

 

I am running on CDH 5.15.0

 

Below are the queue placement rules from our Dynamic Resource Pool Configuration:

1  Use the pool root.[primary group], only if the pool exists.

2  Use the pool root.users.[username] and create the pool if it does not exist.

3  Use the pool Specified at run time. and create the pool if it does not exist.

4  Use the pool root.default. This rule is always satisfied. Subsequent rules are not used.

 

 

Re: Yarn Queue always is root.username when submit hive action with oozie when CDH5.7

Cloudera Employee

Hi,

 

 

Hive places the job in a queue based on the user submitting the job, and applies the queue placement rules to that user. So even if impersonation is disabled, Hive still places the job based on the submitting user and not as user hive. When a user submits a job, Hive evaluates the rules in order, decides which queue the job belongs in, and then submits the job to YARN with that specific queue.

 

So it is recommended to keep the rule "Use the pool Specified at run time. and create the pool if it does not exist." as the first rule.

 

I would suggest trying the following order:

 

1  Use the pool Specified at run time. and create the pool if it does not exist.                        

2  Use the pool root.[primary group], only if the pool exists.           

3  Use the pool root.users.[username] and create the pool if it does not exist.       

4  Use the pool root.default.

 

With this setting, if you specify the queue name using mapred.job.queue.name, the job is submitted to that queue. If you do not provide a queue name, rule 2 or 3 applies and the job is placed in the queue that rule selects. When the job reaches YARN, the first rule is the one that applies, because Hive has already submitted the job with a specific queue, so nothing changes at the YARN level.
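The effect of rule order can be illustrated with a small Python simulation. This is only a simplified model of first-match rule evaluation, not the Fair Scheduler's actual code. The rule labels ("specified", "primary_group", "user", "default"), the function place, the queue root.etl, and the user and group "hive" are all assumptions chosen for the example.

```python
def place(rules, user, group, requested, existing):
    """Return the pool chosen by the first rule that applies, else None.

    rules     -- ordered list of (kind, create_if_missing) pairs
    requested -- queue named at run time, or None if the job names none
    existing  -- set of pools that already exist
    """
    for kind, create in rules:
        if kind == "specified":
            pool = requested
        elif kind == "primary_group":
            pool = "root." + group
        elif kind == "user":
            pool = "root.users." + user
        elif kind == "default":
            return "root.default"  # always satisfied, later rules unused
        else:
            raise ValueError("unknown rule: " + kind)
        # A rule applies only if it yields a pool that exists or may be created.
        if pool is not None and (create or pool in existing):
            return pool
    return None


existing = {"root", "root.default"}

# Order reported earlier in the thread: the run-time queue is checked third.
original_order = [("primary_group", False), ("user", True),
                  ("specified", True), ("default", True)]
# Order suggested in this reply: the run-time queue is checked first.
suggested_order = [("specified", True), ("primary_group", False),
                   ("user", True), ("default", True)]

print(place(original_order, "hive", "hive", "root.etl", existing))
print(place(suggested_order, "hive", "hive", "root.etl", existing))
print(place(suggested_order, "hive", "hive", None, existing))
```

In this model, the original order never reaches the "specified" rule, because the per-user rule always creates a pool first. Moving "specified" to the top lets a queue named at run time win, while jobs that name no queue still fall through to the later rules.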

 

 


Re: Yarn Queue always is root.username when submit hive action with oozie when CDH5.7

Explorer

Thank you Bimalc, 

 

Putting 'Use the pool Specified at run time.' first did the trick. I added 'only if the pool exists.' because I do not want end users to have free control over which queue they set. The Hive job will now always be placed in the user's primary group queue, which is created if it does not exist. :)

 

1	Use the pool Specified at run time., only if the pool exists.	
2	Use the pool root.[primary group] and create the pool if it does not exist. 
This rule is always satisfied. Subsequent rules are not used.