
YARN ACLs for Hive jobs are not in effect when doAs is set to false

In Hive, when doAs is set to true, jobs run as the end user (the user executing the job) and the YARN ACLs set on the queue are in effect. When doAs is set to false, all Hive jobs run as the hive user, and the YARN ACLs are not applied to the end user running the job.

In the scenario below (doAs set to false), user 'user02' runs a job in the 'engineering01' queue, where only 'user02' is allowed to submit applications. The Hive job fails with the error "User hive cannot submit applications to queue root.engineering01".

In this scenario, how can YARN ACLs be enforced for the end user? Granting the hive user permission to submit applications to every queue is not an option.
================
master01:~ # su - user02
-----------------
user02@master01:/root> mapred queue -showacls
17/10/15 19:42:27 INFO impl.TimelineClientImpl: Timeline service address: http://master01.teradata.com:8188/ws/v1/timeline/
17/10/15 19:42:28 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm2
Queue acls for user :  user02

Queue  Operations
=====================
root
default  SUBMIT_APPLICATIONS
engineering01  SUBMIT_APPLICATIONS
support01  ADMINISTER_QUEUE,SUBMIT_APPLICATIONS
--------------------
user02@master01:/root> beeline -u "jdbc:hive2://localhost:10000/default" -n user02 -p user02
WARNING: Use "yarn jar" to launch YARN applications.
Connecting to jdbc:hive2://localhost:10000/default
Connected to: Apache Hive (version 1.2.1.2.3.4.0-3485)
Driver: Hive JDBC (version 1.2.1.2.3.4.0-3485)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 1.2.1.2.3.4.0-3485 by Apache Hive
[INFO] Unable to bind key for unsupported operation: backward-delete-word
[INFO] Unable to bind key for unsupported operation: backward-delete-word
[INFO] Unable to bind key for unsupported operation: down-history
[INFO] Unable to bind key for unsupported operation: up-history
[INFO] Unable to bind key for unsupported operation: up-history
[INFO] Unable to bind key for unsupported operation: down-history
[INFO] Unable to bind key for unsupported operation: up-history
[INFO] Unable to bind key for unsupported operation: down-history
[INFO] Unable to bind key for unsupported operation: up-history
[INFO] Unable to bind key for unsupported operation: down-history
[INFO] Unable to bind key for unsupported operation: up-history
[INFO] Unable to bind key for unsupported operation: down-history


0: jdbc:hive2://localhost:10000/default> set tez.queue.name=engineering01;
No rows affected (0.061 seconds)

0: jdbc:hive2://localhost:10000/default> create table test09 as select * from employee01;
INFO  : Tez session hasn't been created yet. Opening session
ERROR : Failed to execute tez graph.
org.apache.tez.dag.api.TezException: org.apache.hadoop.yarn.exceptions.YarnException: Failed to submit application_1508070646645_0021 to YARN : org.apache.hadoop.security.AccessControlException: User hive cannot submit applications to queue root.engineering01
        at org.apache.tez.client.TezClient.start(TezClient.java:413)
        at org.apache.hadoop.hive.ql.exec.tez.TezSessionState.open(TezSessionState.java:196)
        at org.apache.hadoop.hive.ql.exec.tez.TezTask.updateSession(TezTask.java:271)
        at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:151)
        at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
        at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89)
        at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1703)
        at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1460)
        at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1237)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1101)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1096)
        at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:154)
        at org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:71)
        at org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:206)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
        at org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:218)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.yarn.exceptions.YarnException: Failed to submit application_1508070646645_0021 to YARN : org.apache.hadoop.security.AccessControlException: User hive cannot submit applications to queue root.engineering01
        at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.submitApplication(YarnClientImpl.java:271)
        at org.apache.tez.client.TezYarnClient.submitApplication(TezYarnClient.java:72)
        at org.apache.tez.client.TezClient.start(TezClient.java:408)
        ... 22 more
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.tez.TezTask (state=08S01,code=1)
0: jdbc:hive2://localhost:10000/default>
4 REPLIES

@Turing nix

This appears to be working as designed.

In your example, the YARN ACL allows 'user02' to submit jobs to the engineering01 queue.

Now you have 2 scenarios:

1. When doAs=true in Hive: the job submitted to the queue runs as the end user.

Since user02 submits the job to this queue and YARN ACL allows user02 to do this, the job is accepted.

2. When doAs=false in Hive: the job submitted to the queue runs as the 'hive' user.

Since user02's job is submitted to the queue as the 'hive' user and the YARN ACL only allows user02, the submission correctly fails this time.
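To make the two scenarios concrete, here is a minimal Python sketch of the check. It is a toy model only, not YARN or Hive code. The ACL entries for user02 are copied from the `mapred queue -showacls` output in the question; the empty ACL entry for 'hive' is an assumption that matches the reported error, and the function names are made up for illustration. The doAs setting discussed here is the `hive.server2.enable.doAs` property in HiveServer2.

```python
# Toy model of the queue ACL check; this is NOT YARN or Hive code.
# ACLs for user02 come from the `mapred queue -showacls` output in the question.
# The empty entry for 'hive' is an assumption consistent with the reported error.
QUEUE_ACLS = {
    "user02": {
        "default": {"SUBMIT_APPLICATIONS"},
        "engineering01": {"SUBMIT_APPLICATIONS"},
        "support01": {"ADMINISTER_QUEUE", "SUBMIT_APPLICATIONS"},
    },
    "hive": {},
}

HIVE_SERVICE_USER = "hive"


def effective_submitter(end_user: str, do_as: bool) -> str:
    """Return the user whose identity is used for the YARN submission."""
    return end_user if do_as else HIVE_SERVICE_USER


def can_submit(end_user: str, queue: str, do_as: bool) -> bool:
    """Check SUBMIT_APPLICATIONS for the effective submitter, not the end user."""
    submitter = effective_submitter(end_user, do_as)
    return "SUBMIT_APPLICATIONS" in QUEUE_ACLS.get(submitter, {}).get(queue, set())


if __name__ == "__main__":
    for do_as in (True, False):
        submitter = effective_submitter("user02", do_as)
        allowed = can_submit("user02", "engineering01", do_as)
        print(f"doAs={do_as}: submitted as {submitter}, allowed={allowed}")
```

The point of the sketch is that the ACL lookup is keyed on the effective submitter, so with doAs=false the end user's own queue permissions never enter the check.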

Update:

Whether doAs=true or doAs=false, you can still audit using Ranger.

You can look at the best practices for implementing this with Ranger.

https://hortonworks.com/blog/best-practices-for-hive-authorization-using-apache-ranger-in-hdp-2-2/

The article is for an old version of HDP but the concept is still valid.

@Dinesh Chitlangia

When doAs is set to false, all Hive jobs run as the hive user. I verified this in the Ranger audit and the RM web UI, where the Hive jobs are shown as run by the hive user. How, then, can auditing be done, for example to find out which user submitted which job at what time? If this is not possible, it looks like a shortcoming in Hadoop.

@Turing nix - I have updated my original answer to address your concerns about auditing. Please refer to it, and kindly consider accepting the answer if it helps you. Thank you.

@Turing nix - If my answer helps you, kindly consider accepting it so that this post can be marked as resolved.