ENVIRONMENT and SETUP
I tested the solution below with the following setup.

shiro.ini:
[users]
# List of users with their password allowed to access Zeppelin.
# To use a different strategy (LDAP / Database / ...) check the shiro doc at http://shiro.apache.org/configuration.html#Configuration-INISections
admin = password, admin
maria_dev = password, admin
user1 = password, role1
user2 = password, role2
user3 = password, admin

[main]
#activeDirectoryRealm = org.apache.zeppelin.server.ActiveDirectoryGroupRealm
#activeDirectoryRealm.systemUsername = CN=Administrator,CN=Users,DC=HW,DC=EXAMPLE,DC=COM
#activeDirectoryRealm.systemPassword = Password1!
#activeDirectoryRealm.hadoopSecurityCredentialPath = jceks://user/zeppelin/zeppelin.jceks
#activeDirectoryRealm.searchBase = CN=Users,DC=HW,DC=TEST,DC=COM
#activeDirectoryRealm.url = ldap://ad-nano.test.example.com:389
#activeDirectoryRealm.groupRolesMap = ""
#activeDirectoryRealm.authorizationCachingEnabled = true

#ldapRealm = org.apache.shiro.realm.ldap.JndiLdapRealm
#ldapRealm.userDnTemplate = uid={0},cn=users,cn=accounts,dc=example,dc=com
#ldapRealm.contextFactory.url = ldap://ldaphost:389
#ldapRealm.contextFactory.authenticationMechanism = SIMPLE

sessionManager = org.apache.shiro.web.session.mgt.DefaultWebSessionManager
securityManager.sessionManager = $sessionManager
# 86,400,000 milliseconds = 24 hours
#securityManager.sessionManager.globalSessionTimeout = 86400000
shiro.loginUrl = /api/login

[urls]
# anon means the access is anonymous.
# authcBasic means Basic Auth Security
# To enforce security, comment the line below and uncomment the next one
#/api/version = anon
#/** = anon
/api/interpreter/** = authc, roles[admin]
/api/configurations/** = authc, roles[admin]
/api/credential/** = authc, roles[admin]
/** = authc
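After saving shiro.ini, restart Zeppelin so the new user list takes effect, and optionally confirm a user can authenticate against the /api/login endpoint configured above. A minimal sanity check, assuming a standard HDP layout and Zeppelin on port 9995 (both are assumptions about your environment; on an Ambari-managed cluster you would normally restart Zeppelin from Ambari instead):

# Restart the Zeppelin daemon so the edited shiro.ini is picked up
/usr/hdp/current/zeppelin-server/bin/zeppelin-daemon.sh restart
# Confirm user3 can log in via Zeppelin's REST login endpoint (curl sends a POST)
curl -i --data 'userName=user3&password=password' http://dkhdp252.dk:9995/api/login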
Configure the JDBC interpreter for Hive as follows:
- In the Zeppelin UI -> Interpreter -> JDBC, set hive.url to the value of Ambari -> Hive -> HiveServer2 JDBC URL, for example
jdbc:hive2://dkhdp253.dk:2181,dkhdp252.dk:2181,dkhdp251.dk:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2
- "User Impersonate" under JDBC interpreter is to be unchecked
- In the Hive configuration, ensure hive.server2.enable.doAs is set to true (a quick beeline check follows below)
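Before wiring up Zeppelin, it can help to confirm that the same JDBC URL and doAs behaviour work outside of Zeppelin. A quick beeline check, assuming beeline is available on the node you run it from (the query is just an example):

# Connect through ZooKeeper service discovery as user3; with hive.server2.enable.doAs=true
# the statement should execute as user3, not as the hive service account
beeline -u "jdbc:hive2://dkhdp253.dk:2181,dkhdp252.dk:2181,dkhdp251.dk:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2" -n user3 -e "show databases;"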
Add the following dependencies to the JDBC interpreter:
- org.apache.hive:hive-jdbc:2.0.1
- org.apache.hadoop:hadoop-common:2.7.2
- org.apache.hive.shims:hive-shims-0.23:2.1.0
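Zeppelin resolves these coordinates itself when the interpreter restarts, but if you want to verify the artifacts are reachable from the Zeppelin host first, one optional check (assuming Maven is installed; this step is not part of the original procedure):

# Pre-fetch one of the artifacts to confirm the repository is reachable
mvn dependency:get -Dartifact=org.apache.hive:hive-jdbc:2.0.1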
PROBLEM
When initially running a query through %jdbc(hive), I get:
org.apache.hive.service.cli.HiveSQLException: Failed to validate proxy privilege of zeppelin for user3
	at org.apache.hive.service.auth.HiveAuthFactory.verifyProxyAccess(HiveAuthFactory.java:396)
	at org.apache.hive.service.cli.thrift.ThriftCLIService.getProxyUser(ThriftCLIService.java:751)
	at org.apache.hive.service.cli.thrift.ThriftCLIService.getUserName(ThriftCLIService.java:386)
	at org.apache.hive.service.cli.thrift.ThriftCLIService.getSessionHandle(ThriftCLIService.java:413)
	at org.apache.hive.service.cli.thrift.ThriftCLIService.OpenSession(ThriftCLIService.java:316)
	at org.apache.hive.service.cli.thrift.TCLIService$Processor$OpenSession.getResult(TCLIService.java:1257)
	at org.apache.hive.service.cli.thrift.TCLIService$Processor$OpenSession.getResult(TCLIService.java:1242)
	at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
	at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
	at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:562)
	at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.security.authorize.AuthorizationException: User: zeppelin is not allowed to impersonate user3
	at org.apache.hadoop.security.authorize.DefaultImpersonationProvider.authorize(DefaultImpersonationProvider.java:119)
	at org.apache.hadoop.security.authorize.ProxyUsers.authorize(ProxyUsers.java:102)
	at org.apache.hadoop.security.authorize.ProxyUsers.authorize(ProxyUsers.java:116)
	at org.apache.hive.service.auth.HiveAuthFactory.verifyProxyAccess(HiveAuthFactory.java:392)
	... 13 more
The fix is to add the following lines under HDFS Service -> Configs -> "Custom core-site":
hadoop.proxyuser.zeppelin.hosts=*
hadoop.proxyuser.zeppelin.groups=*
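For reference, Ambari renders these entries into core-site.xml as the properties below (shown only to make the effect explicit; on an Ambari-managed cluster you do not edit the file by hand). Restart HDFS and HiveServer2 afterwards so the proxyuser settings are re-read:

<property>
  <name>hadoop.proxyuser.zeppelin.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.zeppelin.groups</name>
  <value>*</value>
</property>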
Next, running a query through the JDBC interpreter for Hive as "user3" returns the following in hiveserver2.log:
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=user3, access=WRITE, inode="/user/user3":hdfs:hdfs:drwxr-xr-x
	at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:319)
	at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:292)
	at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:213)
	at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:190)
	at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1827)
	at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1811)
	at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkAncestorAccess(FSDirectory.java:1794)
	at org.apache.hadoop.hdfs.server.namenode.FSDirMkdirOp.mkdirs(FSDirMkdirOp.java:71)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:4011)
	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:1102)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:630)
	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2313)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2309)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
SOLUTION
As the first step to sort out the problem, I create a dedicated home folder for the user in HDFS:
[root@dkhdp252 hive]# hdfs dfs -mkdir /user/user3
[root@dkhdp252 hive]# hdfs dfs -chown user3:hdfs /user/user3
[root@dkhdp252 hive]# hdfs dfs -chmod 755 /user/user3
[root@dkhdp252 hive]# hdfs dfs -ls /user
Found 12 items
drwxr-xr-x   - admin     hdfs          0 2016-12-10 07:49 /user/admin
drwxrwx---   - ambari-qa hdfs          0 2017-01-30 15:32 /user/ambari-qa
drwxr-xr-x   - hcat      hdfs          0 2016-11-29 09:25 /user/hcat
drwxr-xr-x   - hdfs      hdfs          0 2016-12-06 08:04 /user/hdfs
drwxr-xr-x   - hive      hdfs          0 2017-02-06 14:23 /user/hive
drwxrwxr-x   - livy      hdfs          0 2016-11-29 09:52 /user/livy
drwxr-xr-x   - maria_dev hdfs          0 2017-02-07 15:53 /user/maria_dev
drwxrwxr-x   - oozie     hdfs          0 2016-12-09 16:05 /user/oozie
drwxrwxr-x   - spark     hdfs          0 2016-11-29 16:30 /user/spark
drwxr-xr-x   - user3     hdfs          0 2017-02-07 16:01 /user/user3
drwxr-xr-x   - zeppelin  hdfs          0 2016-11-29 16:17 /user/zeppelin
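If several Zeppelin users need home folders, the same three commands can be scripted. A minimal sketch, assuming it runs as root (or any account allowed to sudo to hdfs) and that the user names match the shiro.ini entries above:

# Provision an HDFS home folder for each Zeppelin user in one pass
for u in user1 user2 user3; do
  sudo -u hdfs hdfs dfs -mkdir -p /user/$u
  sudo -u hdfs hdfs dfs -chown $u:hdfs /user/$u
  sudo -u hdfs hdfs dfs -chmod 755 /user/$u
done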
Next, restart the JDBC interpreter.
Now, when running the same query again, the job starts up in the ResourceManager UI; however, the application log shows:
Application application_1486481563532_0002 failed 2 times due to AM Container for appattempt_1486481563532_0002_000002 exited with exitCode: -1000
For more detailed output, check the application tracking page: http://dkhdp253.dk:8088/cluster/app/application_1486481563532_0002 Then click on links to logs of each attempt.
Diagnostics: Application application_1486481563532_0002 initialization failed (exitCode=255) with output:
main : command provided 0
main : run as user is user3
main : requested yarn user is user3
User user3 not found
Failing this attempt. Failing the application.
The next step is to create "user3" as an OS user on all the worker nodes:
$ adduser user3
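To cover all worker nodes in one pass, the account creation can be looped over the node list. A sketch, assuming passwordless root SSH between the nodes (the hostnames are the ones used in this cluster; adjust to yours):

# Create user3 on every worker node unless it already exists
for host in dkhdp251.dk dkhdp252.dk dkhdp253.dk; do
  ssh root@$host 'id -u user3 >/dev/null 2>&1 || adduser user3'
done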
Restart the JDBC interpreter again.
Now, re-running the query, the job runs and completes successfully.
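As a final check, a simple paragraph run while logged in as user3 should exercise the whole chain end to end (the query is just an example; current_user() is available in Hive 1.2+ and should return user3 when impersonation is working):

%jdbc(hive)
select current_user();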