
ENVIRONMENT AND SETUP

I tested the solution below with:

  • HDP 2.5.0.0-1245 and Ambari 2.4.0.1
  • HDP 2.5.3.0-37 and Ambari 2.4.2.0

Shiro.ini

[users]
# List of users with their password allowed to access Zeppelin.
# To use a different strategy (LDAP / Database / ...) check the shiro doc at http://shiro.apache.org/configuration.html#Configuration-INISections
admin = password, admin
maria_dev = password, admin
user1 = password, role1
user2 = password, role2
user3 = password, admin

[main]
#activeDirectoryRealm = org.apache.zeppelin.server.ActiveDirectoryGroupRealm
#activeDirectoryRealm.systemUsername = CN=Administrator,CN=Users,DC=HW,DC=EXAMPLE,DC=COM
#activeDirectoryRealm.systemPassword = Password1!
#activeDirectoryRealm.hadoopSecurityCredentialPath = jceks://user/zeppelin/zeppelin.jceks
#activeDirectoryRealm.searchBase = CN=Users,DC=HW,DC=TEST,DC=COM
#activeDirectoryRealm.url = ldap://ad-nano.test.example.com:389
#activeDirectoryRealm.groupRolesMap = ""
#activeDirectoryRealm.authorizationCachingEnabled = true
#ldapRealm = org.apache.shiro.realm.ldap.JndiLdapRealm
#ldapRealm.userDnTemplate = uid={0},cn=users,cn=accounts,dc=example,dc=com
#ldapRealm.contextFactory.url = ldap://ldaphost:389
#ldapRealm.contextFactory.authenticationMechanism = SIMPLE
sessionManager = org.apache.shiro.web.session.mgt.DefaultWebSessionManager
securityManager.sessionManager = $sessionManager
# 86,400,000 milliseconds = 24 hours
#securityManager.sessionManager.globalSessionTimeout = 86400000
shiro.loginUrl = /api/login

[urls]
# anon means the access is anonymous.
# authcBasic means Basic Auth Security
# To enforce security, comment the line below and uncomment the next one
#/api/version = anon
#/** = anon
/api/interpreter/** = authc, roles[admin]
/api/configurations/** = authc, roles[admin]
/api/credential/** = authc, roles[admin]
/** = authc

Configure the JDBC interpreter for Hive as follows:

- In Zeppelin UI -> Interpreter -> JDBC, set hive.url to the HiveServer2 JDBC URL shown in Ambari -> Hive, for example:

jdbc:hive2://dkhdp253.dk:2181,dkhdp252.dk:2181,dkhdp251.dk:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2

- Uncheck "User Impersonate" in the JDBC interpreter settings.

- In the Hive config, ensure hive.server2.enable.doAs is set to true.

Dependencies in the JDBC interpreter:

- org.apache.hive:hive-jdbc:2.0.1

- org.apache.hadoop:hadoop-common:2.7.2

- org.apache.hive.shims:hive-shims-0.23:2.1.0

PROBLEM

When I first run a query through %jdbc(hive), I get the following error:

org.apache.hive.service.cli.HiveSQLException: Failed to validate proxy privilege of zeppelin for user3
at org.apache.hive.service.auth.HiveAuthFactory.verifyProxyAccess(HiveAuthFactory.java:396)
at org.apache.hive.service.cli.thrift.ThriftCLIService.getProxyUser(ThriftCLIService.java:751)
at org.apache.hive.service.cli.thrift.ThriftCLIService.getUserName(ThriftCLIService.java:386)
at org.apache.hive.service.cli.thrift.ThriftCLIService.getSessionHandle(ThriftCLIService.java:413)
at org.apache.hive.service.cli.thrift.ThriftCLIService.OpenSession(ThriftCLIService.java:316)
at org.apache.hive.service.cli.thrift.TCLIService$Processor$OpenSession.getResult(TCLIService.java:1257)
at org.apache.hive.service.cli.thrift.TCLIService$Processor$OpenSession.getResult(TCLIService.java:1242)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:562)
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.security.authorize.AuthorizationException: User: zeppelin is not allowed to impersonate user3
at org.apache.hadoop.security.authorize.DefaultImpersonationProvider.authorize(DefaultImpersonationProvider.java:119)
at org.apache.hadoop.security.authorize.ProxyUsers.authorize(ProxyUsers.java:102)
at org.apache.hadoop.security.authorize.ProxyUsers.authorize(ProxyUsers.java:116)
at org.apache.hive.service.auth.HiveAuthFactory.verifyProxyAccess(HiveAuthFactory.java:392)
	... 13 more
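
The failure can also be reproduced outside Zeppelin, which helps confirm that it comes from the Hadoop proxy-user check rather than from the interpreter itself. The following is a minimal sketch, assuming a non-Kerberized cluster with beeline on the PATH. It connects as the service user zeppelin and asks HiveServer2 to proxy user3 through the hive.server2.proxy.user session parameter, mirroring the zeppelin/user3 pair in the error message. The function name build_proxy_url and the DRY_RUN switch are illustrative and not part of any tool.

```shell
#!/usr/bin/env bash
# Sketch: open a HiveServer2 session as "zeppelin" on behalf of "user3".
# DRY_RUN=1 (the default) only prints the beeline command; set DRY_RUN=0 to run it.
DRY_RUN="${DRY_RUN:-1}"

# HiveServer2 JDBC URL as shown in Ambari -> Hive (same URL as in the interpreter setup).
HIVE_URL="jdbc:hive2://dkhdp253.dk:2181,dkhdp252.dk:2181,dkhdp251.dk:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2"

# Append the proxy-user session parameter to a JDBC URL.
build_proxy_url() {
  local base_url="$1" proxy_user="$2"
  printf '%s;hive.server2.proxy.user=%s\n' "$base_url" "$proxy_user"
}

url="$(build_proxy_url "$HIVE_URL" user3)"
if [ "$DRY_RUN" = "1" ]; then
  echo "beeline -u \"$url\" -n zeppelin -e \"select current_user();\""
else
  beeline -u "$url" -n zeppelin -e "select current_user();"
fi
```

If the proxy-user settings for zeppelin are missing, the session should fail while opening, before the query itself is executed.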

The fix is to add the following lines under HDFS Service -> Configs -> "Custom core-site":

hadoop.proxyuser.zeppelin.hosts=*
hadoop.proxyuser.zeppelin.groups=*
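
To check that both properties reached the node's configuration, the values can be read back with hdfs getconf. This is a minimal sketch: check_proxyuser and DRY_RUN are illustrative names, and the command reads the client configuration on the node where it runs, so it does not show what an already running HiveServer2 has loaded. Restart whichever services Ambari flags after the change.

```shell
#!/usr/bin/env bash
# Sketch: print the proxy-user settings for a service user from the local Hadoop config.
# DRY_RUN=1 (the default) only prints the commands; set DRY_RUN=0 on a cluster node.
DRY_RUN="${DRY_RUN:-1}"

check_proxyuser() {
  local svc_user="$1" key
  for key in hosts groups; do
    if [ "$DRY_RUN" = "1" ]; then
      echo "hdfs getconf -confKey hadoop.proxyuser.${svc_user}.${key}"
    else
      hdfs getconf -confKey "hadoop.proxyuser.${svc_user}.${key}"
    fi
  done
}

check_proxyuser zeppelin
```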

Next, running a query in the JDBC interpreter (Hive, in this case) as “user3” produces the following in hiveserver2.log:

Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=user3, access=WRITE, inode="/user/user3":hdfs:hdfs:drwxr-xr-x
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:319)
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:292)
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:213)
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:190)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1827)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1811)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkAncestorAccess(FSDirectory.java:1794)
        at org.apache.hadoop.hdfs.server.namenode.FSDirMkdirOp.mkdirs(FSDirMkdirOp.java:71)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:4011)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:1102)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:630)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2313)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2309)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)

SOLUTION

As the first step, I create a dedicated home directory for the user in HDFS:

[root@dkhdp252 hive]# hdfs dfs -mkdir /user/user3
[root@dkhdp252 hive]# hdfs dfs -chown user3:hdfs /user/user3
[root@dkhdp252 hive]# hdfs dfs -chmod 755 /user/user3
[root@dkhdp252 hive]# hdfs dfs -ls /user
Found 12 items
drwxr-xr-x   - admin               hdfs          0 2016-12-10 07:49 /user/admin
drwxrwx---   - ambari-qa           hdfs          0 2017-01-30 15:32 /user/ambari-qa
drwxr-xr-x   - hcat                hdfs          0 2016-11-29 09:25 /user/hcat
drwxr-xr-x   - hdfs                hdfs          0 2016-12-06 08:04 /user/hdfs
drwxr-xr-x   - hive                hdfs          0 2017-02-06 14:23 /user/hive
drwxrwxr-x   - livy                hdfs          0 2016-11-29 09:52 /user/livy
drwxr-xr-x   - maria_dev           hdfs          0 2017-02-07 15:53 /user/maria_dev
drwxrwxr-x   - oozie               hdfs          0 2016-12-09 16:05 /user/oozie
drwxrwxr-x   - spark               hdfs          0 2016-11-29 16:30 /user/spark
drwxr-xr-x   - user3               hdfs          0 2017-02-07 16:01 /user/user3
drwxr-xr-x   - zeppelin            hdfs          0 2016-11-29 16:17 /user/zeppelin
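
Every Zeppelin user who runs Hive queries this way needs the same home directory, so the three commands can be wrapped in a loop. The sketch below is one way to do that for the non-admin users listed in shiro.ini. It assumes the script runs as an HDFS superuser (chown requires it), and the helper names run and create_hdfs_home as well as the DRY_RUN switch are illustrative.

```shell
#!/usr/bin/env bash
# Sketch: create an HDFS home directory for each Zeppelin user.
# DRY_RUN=1 (the default) only prints the commands; set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi
}

create_hdfs_home() {
  local user="$1"
  run hdfs dfs -mkdir -p "/user/${user}"               # -p: no error if it already exists
  run hdfs dfs -chown "${user}:hdfs" "/user/${user}"   # owner must be the impersonated user
  run hdfs dfs -chmod 755 "/user/${user}"
}

for u in user1 user2 user3; do
  create_hdfs_home "$u"
done
```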

Next, restart the JDBC interpreter.

Now, when I run the same query again, the job starts and is visible in the ResourceManager UI; however, the application log shows:

Application application_1486481563532_0002 failed 2 times due to AM Container for appattempt_1486481563532_0002_000002 exited with exitCode: -1000
For more detailed output, check the application tracking page: http://dkhdp253.dk:8088/cluster/app/application_1486481563532_0002 Then click on links to logs of each attempt.
Diagnostics: Application application_1486481563532_0002 initialization failed (exitCode=255) with output: main : command provided 0
main : run as user is user3
main : requested yarn user is user3
User user3 not found
Failing this attempt. Failing the application.

The next step is to create the “user3” OS account on all worker nodes, for example:

$ adduser user3
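
On a cluster with more than a couple of workers, the account can be created over SSH in one pass. This is a minimal sketch under stated assumptions: the host names in WORKERS are placeholders (the article does not list the worker nodes), root SSH access to each node is available, and add_user_on_workers and DRY_RUN are illustrative names.

```shell
#!/usr/bin/env bash
# Sketch: create a local OS account on every worker node over SSH.
# DRY_RUN=1 (the default) only prints the commands; set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"

# Placeholder host names; replace with the cluster's actual worker nodes.
WORKERS="${WORKERS:-worker1.example.com worker2.example.com}"

add_user_on_workers() {
  local user="$1" host
  # Only call adduser when the account does not exist yet on that node.
  local remote="id -u ${user} >/dev/null 2>&1 || adduser ${user}"
  for host in $WORKERS; do
    if [ "$DRY_RUN" = "1" ]; then
      echo "ssh root@${host} '${remote}'"
    else
      ssh "root@${host}" "$remote"
    fi
  done
}

add_user_on_workers user3
```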

Restart the JDBC interpreter.

Now, when I re-run the query, the job runs and completes successfully.
