ENVIRONMENT and SETUP
I tested the solution below with the following setup.

shiro.ini:
[users]
# List of users with their password allowed to access Zeppelin.
# To use a different strategy (LDAP / Database / ...) check the shiro doc at http://shiro.apache.org/configuration.html#Configuration-INISections
admin = password, admin
maria_dev = password, admin
user1 = password, role1
user2 = password, role2
user3 = password, admin

[main]
#activeDirectoryRealm = org.apache.zeppelin.server.ActiveDirectoryGroupRealm
#activeDirectoryRealm.systemUsername = CN=Administrator,CN=Users,DC=HW,DC=EXAMPLE,DC=COM
#activeDirectoryRealm.systemPassword = Password1!
#activeDirectoryRealm.hadoopSecurityCredentialPath = jceks://user/zeppelin/zeppelin.jceks
#activeDirectoryRealm.searchBase = CN=Users,DC=HW,DC=TEST,DC=COM
#activeDirectoryRealm.url = ldap://ad-nano.test.example.com:389
#activeDirectoryRealm.groupRolesMap = ""
#activeDirectoryRealm.authorizationCachingEnabled = true

#ldapRealm = org.apache.shiro.realm.ldap.JndiLdapRealm
#ldapRealm.userDnTemplate = uid={0},cn=users,cn=accounts,dc=example,dc=com
#ldapRealm.contextFactory.url = ldap://ldaphost:389
#ldapRealm.contextFactory.authenticationMechanism = SIMPLE

sessionManager = org.apache.shiro.web.session.mgt.DefaultWebSessionManager
securityManager.sessionManager = $sessionManager
# 86,400,000 milliseconds = 24 hours
#securityManager.sessionManager.globalSessionTimeout = 86400000
shiro.loginUrl = /api/login

[urls]
# anon means the access is anonymous.
# authcBasic means Basic Auth Security
# To enforce security, comment the line below and uncomment the next one
#/api/version = anon
#/** = anon
/api/interpreter/** = authc, roles[admin]
/api/configurations/** = authc, roles[admin]
/api/credential/** = authc, roles[admin]
/** = authc
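After saving shiro.ini, restart Zeppelin so the new user list takes effect, and optionally confirm a user can authenticate against the /api/login endpoint configured above. A minimal sanity check, assuming a standard HDP layout and Zeppelin on port 9995 (both are assumptions about your environment; on an Ambari-managed cluster you would normally restart Zeppelin from Ambari instead):

# Restart the Zeppelin daemon so the edited shiro.ini is picked up
/usr/hdp/current/zeppelin-server/bin/zeppelin-daemon.sh restart
# Confirm user3 can log in via Zeppelin's REST login endpoint (curl sends a POST)
curl -i --data 'userName=user3&password=password' http://dkhdp252.dk:9995/api/login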
Configure the JDBC interpreter for Hive as follows:
- In the Zeppelin UI -> Interpreter -> JDBC, set hive.url to the value of Ambari -> Hive -> HiveServer2 JDBC URL, for example
jdbc:hive2://dkhdp253.dk:2181,dkhdp252.dk:2181,dkhdp251.dk:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2
- "User Impersonate" under JDBC interpreter is to be unchecked
- In the Hive configuration, ensure hive.server2.enable.doAs is set to true (a quick beeline check follows below)
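Before wiring up Zeppelin, it can help to confirm that the same JDBC URL and doAs behaviour work outside of Zeppelin. A quick beeline check, assuming beeline is available on the node you run it from (the query is just an example):

# Connect through ZooKeeper service discovery as user3; with hive.server2.enable.doAs=true
# the statement should execute as user3, not as the hive service account
beeline -u "jdbc:hive2://dkhdp253.dk:2181,dkhdp252.dk:2181,dkhdp251.dk:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2" -n user3 -e "show databases;"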
Add the following dependencies to the JDBC interpreter:
- org.apache.hive:hive-jdbc:2.0.1
- org.apache.hadoop:hadoop-common:2.7.2
- org.apache.hive.shims:hive-shims-0.23:2.1.0
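Zeppelin resolves these coordinates itself when the interpreter restarts, but if you want to verify the artifacts are reachable from the Zeppelin host first, one optional check (assuming Maven is installed; this step is not part of the original procedure):

# Pre-fetch one of the artifacts to confirm the repository is reachable
mvn dependency:get -Dartifact=org.apache.hive:hive-jdbc:2.0.1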
PROBLEM
When initially running a query through %jdbc(hive), I get:
org.apache.hive.service.cli.HiveSQLException: Failed to validate proxy privilege of zeppelin for user3
	at org.apache.hive.service.auth.HiveAuthFactory.verifyProxyAccess(HiveAuthFactory.java:396)
	at org.apache.hive.service.cli.thrift.ThriftCLIService.getProxyUser(ThriftCLIService.java:751)
	at org.apache.hive.service.cli.thrift.ThriftCLIService.getUserName(ThriftCLIService.java:386)
	at org.apache.hive.service.cli.thrift.ThriftCLIService.getSessionHandle(ThriftCLIService.java:413)
	at org.apache.hive.service.cli.thrift.ThriftCLIService.OpenSession(ThriftCLIService.java:316)
	at org.apache.hive.service.cli.thrift.TCLIService$Processor$OpenSession.getResult(TCLIService.java:1257)
	at org.apache.hive.service.cli.thrift.TCLIService$Processor$OpenSession.getResult(TCLIService.java:1242)
	at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
	at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
	at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:562)
	at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.security.authorize.AuthorizationException: User: zeppelin is not allowed to impersonate user3
	at org.apache.hadoop.security.authorize.DefaultImpersonationProvider.authorize(DefaultImpersonationProvider.java:119)
	at org.apache.hadoop.security.authorize.ProxyUsers.authorize(ProxyUsers.java:102)
	at org.apache.hadoop.security.authorize.ProxyUsers.authorize(ProxyUsers.java:116)
	at org.apache.hive.service.auth.HiveAuthFactory.verifyProxyAccess(HiveAuthFactory.java:392)
	... 13 more
The fix is to add the following lines under HDFS Service -> Configs -> "Custom core-site":
hadoop.proxyuser.zeppelin.hosts=*
hadoop.proxyuser.zeppelin.groups=*
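For reference, Ambari renders these entries into core-site.xml as the properties below (shown only to make the effect explicit; on an Ambari-managed cluster you do not edit the file by hand). Restart HDFS and HiveServer2 afterwards so the proxyuser settings are re-read:

<property>
  <name>hadoop.proxyuser.zeppelin.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.zeppelin.groups</name>
  <value>*</value>
</property>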
Next, running a query through the JDBC interpreter for Hive as "user3" returns the following in hiveserver2.log:
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=user3, access=WRITE, inode="/user/user3":hdfs:hdfs:drwxr-xr-x
	at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:319)
	at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:292)
	at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:213)
	at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:190)
	at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1827)
	at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1811)
	at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkAncestorAccess(FSDirectory.java:1794)
	at org.apache.hadoop.hdfs.server.namenode.FSDirMkdirOp.mkdirs(FSDirMkdirOp.java:71)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:4011)
	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:1102)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:630)
	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2313)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2309)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
SOLUTION
As the first step to sort out the problem, I create a dedicated home folder for the user in HDFS:
[root@dkhdp252 hive]# hdfs dfs -mkdir /user/user3
[root@dkhdp252 hive]# hdfs dfs -chown user3:hdfs /user/user3
[root@dkhdp252 hive]# hdfs dfs -chmod 755 /user/user3
[root@dkhdp252 hive]# hdfs dfs -ls /user
Found 12 items
drwxr-xr-x   - admin     hdfs          0 2016-12-10 07:49 /user/admin
drwxrwx---   - ambari-qa hdfs          0 2017-01-30 15:32 /user/ambari-qa
drwxr-xr-x   - hcat      hdfs          0 2016-11-29 09:25 /user/hcat
drwxr-xr-x   - hdfs      hdfs          0 2016-12-06 08:04 /user/hdfs
drwxr-xr-x   - hive      hdfs          0 2017-02-06 14:23 /user/hive
drwxrwxr-x   - livy      hdfs          0 2016-11-29 09:52 /user/livy
drwxr-xr-x   - maria_dev hdfs          0 2017-02-07 15:53 /user/maria_dev
drwxrwxr-x   - oozie     hdfs          0 2016-12-09 16:05 /user/oozie
drwxrwxr-x   - spark     hdfs          0 2016-11-29 16:30 /user/spark
drwxr-xr-x   - user3     hdfs          0 2017-02-07 16:01 /user/user3
drwxr-xr-x   - zeppelin  hdfs          0 2016-11-29 16:17 /user/zeppelin
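If several Zeppelin users need home folders, the same three commands can be scripted. A minimal sketch, assuming it runs as root (or any account allowed to sudo to hdfs) and that the user names match the shiro.ini entries above:

# Provision an HDFS home folder for each Zeppelin user in one pass
for u in user1 user2 user3; do
  sudo -u hdfs hdfs dfs -mkdir -p /user/$u
  sudo -u hdfs hdfs dfs -chown $u:hdfs /user/$u
  sudo -u hdfs hdfs dfs -chmod 755 /user/$u
done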
Next, restart the JDBC interpreter.
Now, when running the same query again, the job starts up in the ResourceManager UI; however, the application log shows:
Application application_1486481563532_0002 failed 2 times due to AM Container for appattempt_1486481563532_0002_000002 exited with exitCode: -1000
For more detailed output, check the application tracking page: http://dkhdp253.dk:8088/cluster/app/application_1486481563532_0002 Then click on links to logs of each attempt.
Diagnostics: Application application_1486481563532_0002 initialization failed (exitCode=255) with output:
main : command provided 0
main : run as user is user3
main : requested yarn user is user3
User user3 not found
Failing this attempt. Failing the application.
The next step is to create "user3" as an OS user on all the worker nodes:
$ adduser user3
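To cover all worker nodes in one pass, the account creation can be looped over the node list. A sketch, assuming passwordless root SSH between the nodes (the hostnames are the ones used in this cluster; adjust to yours):

# Create user3 on every worker node unless it already exists
for host in dkhdp251.dk dkhdp252.dk dkhdp253.dk; do
  ssh root@$host 'id -u user3 >/dev/null 2>&1 || adduser user3'
done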
Restart the JDBC interpreter again.
Now, re-running the query, the job runs and completes successfully.
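As a final check, a simple paragraph run while logged in as user3 should exercise the whole chain end to end (the query is just an example; current_user() is available in Hive 1.2+ and should return user3 when impersonation is working):

%jdbc(hive)
select current_user();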