Created on 02-07-2017 04:31 PM
I tested the solution below with the following Shiro configuration:
    [users]
    # List of users with their password allowed to access Zeppelin.
    # To use a different strategy (LDAP / Database / ...) check the shiro doc at
    admin = password, admin
    maria_dev = password, admin
    user1 = password, role1
    user2 = password, role2
    user3 = password, admin

    [main]
    #activeDirectoryRealm = org.apache.zeppelin.server.ActiveDirectoryGroupRealm
    #activeDirectoryRealm.systemUsername = CN=Administrator,CN=Users,DC=HW,DC=EXAMPLE,DC=COM
    #activeDirectoryRealm.systemPassword = Password1!
    #activeDirectoryRealm.hadoopSecurityCredentialPath = jceks://user/zeppelin/zeppelin.jceks
    #activeDirectoryRealm.searchBase = CN=Users,DC=HW,DC=TEST,DC=COM
    #activeDirectoryRealm.url = ldap://
    #activeDirectoryRealm.groupRolesMap = ""
    #activeDirectoryRealm.authorizationCachingEnabled = true
    #ldapRealm = org.apache.shiro.realm.ldap.JndiLdapRealm
    #ldapRealm.userDnTemplate = uid={0},cn=users,cn=accounts,dc=example,dc=com
    #ldapRealm.contextFactory.url = ldap://ldaphost:389
    #ldapRealm.contextFactory.authenticationMechanism = SIMPLE
    sessionManager = org.apache.shiro.web.session.mgt.DefaultWebSessionManager
    securityManager.sessionManager = $sessionManager
    # 86,400,000 milliseconds = 24 hour
    #securityManager.sessionManager.globalSessionTimeout = 86400000
    shiro.loginUrl = /api/login

    [urls]
    # anon means the access is anonymous.
    # authcBasic means Basic Auth Security
    # To enforce security, comment the line below and uncomment the next one
    #/api/version = anon
    #/** = anon
    /api/interpreter/** = authc, roles[admin]
    /api/configurations/** = authc, roles[admin]
    /api/credential/** = authc, roles[admin]
    /** = authc
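Since the [urls] section restricts the interpreter, configuration and credential APIs to roles[admin], it can help to double-check which roles each login in the [users] section actually has. The following is only a rough sketch: the function name list_shiro_users is made up for illustration, and it assumes the simple "user = password, role1, role2" layout shown above.

```shell
# Print "user<TAB>roles" for every entry in the [users] section of a shiro.ini file.
# list_shiro_users is a hypothetical helper name, not part of Zeppelin or Shiro.
list_shiro_users() {
  awk '
    # A "[section]" header: remember whether we are inside [users].
    /^[ \t]*\[/ { in_users = ($0 ~ /\[users\]/); next }
    # Ignore lines from other sections.
    !in_users { next }
    # Ignore comments and blank lines.
    /^[ \t]*(#|$)/ { next }
    {
      eq = index($0, "=")
      if (eq == 0) next
      user = substr($0, 1, eq - 1)
      rest = substr($0, eq + 1)          # "password, role1, role2"
      gsub(/[ \t]/, "", user)
      gsub(/[ \t]/, "", rest)
      n = split(rest, parts, ",")        # parts[1] is the password, the rest are roles
      roles = ""
      for (i = 2; i <= n; i++) roles = roles (i > 2 ? "," : "") parts[i]
      print user "\t" roles
    }
  ' "$1"
}

# Usage (the path is a placeholder for wherever your shiro.ini lives):
#   list_shiro_users /path/to/shiro.ini
```

The password field is deliberately dropped from the output so the listing can be shared without exposing credentials.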
Configure the JDBC interpreter for Hive as follows:
- Zeppelin UI -> Interpreter -> JDBC -> hive.url: use the URL from Ambari -> Hive -> HiveServer2 JDBC URL, like
- Leave "User Impersonate" unchecked in the JDBC interpreter settings
- In the Hive config, ensure hive.server2.enable.doAs is set to true
Dependencies in the JDBC interpreter:
- org.apache.hive:hive-jdbc:2.0.1
- org.apache.hadoop:hadoop-common:2.7.2
- org.apache.hive.shims:hive-shims-0.23:2.1.0
When I first ran a query through %jdbc(hive), I got the following error:
    org.apache.hive.service.cli.HiveSQLException: Failed to validate proxy privilege of zeppelin for user3
        at org.apache.hive.service.auth.HiveAuthFactory.verifyProxyAccess(
        at org.apache.hive.service.cli.thrift.ThriftCLIService.getProxyUser(
        at org.apache.hive.service.cli.thrift.ThriftCLIService.getUserName(
        at org.apache.hive.service.cli.thrift.ThriftCLIService.getSessionHandle(
        at org.apache.hive.service.cli.thrift.ThriftCLIService.OpenSession(
        at org.apache.hive.service.cli.thrift.TCLIService$Processor$OpenSession.getResult(
        at org.apache.hive.service.cli.thrift.TCLIService$Processor$OpenSession.getResult(
        at org.apache.thrift.ProcessFunction.process(
        at org.apache.thrift.TBaseProcessor.process(
        at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(
        at org.apache.thrift.server.TThreadPoolServer$
        at java.util.concurrent.ThreadPoolExecutor.runWorker(
        at java.util.concurrent.ThreadPoolExecutor$
        at
    Caused by: User: zeppelin is not allowed to impersonate user3
        at
        at
        at
        at org.apache.hive.service.auth.HiveAuthFactory.verifyProxyAccess(
        ... 13 more
The fix is to add the following properties under HDFS Service -> Configs -> "Custom core-site":
    hadoop.proxyuser.zeppelin.hosts=*
    hadoop.proxyuser.zeppelin.groups=*
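To confirm that the two proxyuser properties have actually reached a node's client configuration, something like the sketch below may help. It relies on `hdfs getconf -confKey`, which reads the local Hadoop configuration files; the function name show_proxyuser_conf is only illustrative, and the output reflects the node where you run it, not necessarily what HiveServer2 has loaded.

```shell
# Print the proxyuser settings for a service user (defaults to "zeppelin")
# as seen by the local Hadoop client configuration.
# show_proxyuser_conf is a hypothetical helper name.
show_proxyuser_conf() {
  local svc_user="${1:-zeppelin}"
  local key
  for key in hosts groups; do
    printf 'hadoop.proxyuser.%s.%s=' "$svc_user" "$key"
    hdfs getconf -confKey "hadoop.proxyuser.${svc_user}.${key}"
  done
}

# Usage:
#   show_proxyuser_conf zeppelin
```

If a key is missing, `hdfs getconf` reports an error for it instead of printing a value, which is a quick hint that the config change has not been pushed to that node yet.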
Next, running a query in the JDBC interpreter (e.g. for Hive) as “user3” returns the following in hiveserver2.log:
    Caused by: org.apache.hadoop.ipc.RemoteException( Permission denied: user=user3, access=WRITE, inode="/user/user3":hdfs:hdfs:drwxr-xr-x
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkAncestorAccess(
        at org.apache.hadoop.hdfs.server.namenode.FSDirMkdirOp.mkdirs(
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$
        at org.apache.hadoop.ipc.RPC$
        at org.apache.hadoop.ipc.Server$Handler$
        at org.apache.hadoop.ipc.Server$Handler$
        at Method)
        at
        at
As a first step to sort out the problem, I created a dedicated home folder for the user in HDFS:
    [root@dkhdp252 hive]# hdfs dfs -mkdir /user/user3
    [root@dkhdp252 hive]# hdfs dfs -chown user3:hdfs /user/user3
    [root@dkhdp252 hive]# hdfs dfs -chmod 755 /user/user3
    [root@dkhdp252 hive]# hdfs dfs -ls /user
    Found 12 items
    drwxr-xr-x   - admin     hdfs          0 2016-12-10 07:49 /user/admin
    drwxrwx---   - ambari-qa hdfs          0 2017-01-30 15:32 /user/ambari-qa
    drwxr-xr-x   - hcat      hdfs          0 2016-11-29 09:25 /user/hcat
    drwxr-xr-x   - hdfs      hdfs          0 2016-12-06 08:04 /user/hdfs
    drwxr-xr-x   - hive      hdfs          0 2017-02-06 14:23 /user/hive
    drwxrwxr-x   - livy      hdfs          0 2016-11-29 09:52 /user/livy
    drwxr-xr-x   - maria_dev hdfs          0 2017-02-07 15:53 /user/maria_dev
    drwxrwxr-x   - oozie     hdfs          0 2016-12-09 16:05 /user/oozie
    drwxrwxr-x   - spark     hdfs          0 2016-11-29 16:30 /user/spark
    drwxr-xr-x   - user3     hdfs          0 2017-02-07 16:01 /user/user3
    drwxr-xr-x   - zeppelin  hdfs          0 2016-11-29 16:17 /user/zeppelin
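If several Zeppelin users need a home folder, the three commands above can be wrapped in a small function. This is just a sketch: create_hdfs_home is an illustrative name, the hdfs group and mode 755 are simply the values used in this post, and the chown step typically needs to run as an account with HDFS superuser rights.

```shell
# Create /user/<name> in HDFS, owned by that user, mirroring the commands above.
# create_hdfs_home is a hypothetical helper name.
create_hdfs_home() {
  local user="$1"
  local home="/user/${user}"
  # -p keeps mkdir from failing when the directory already exists.
  hdfs dfs -mkdir -p "$home" &&
    hdfs dfs -chown "${user}:hdfs" "$home" &&
    hdfs dfs -chmod 755 "$home"
}

# Usage:
#   create_hdfs_home user3
```

The steps are chained with && so that a failed mkdir does not go on to chown or chmod a path that is not there.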
Next, restart the JDBC interpreter.
Now, when I run the same query again, the job starts up in the RM UI; however, the application log shows:
    Application application_1486481563532_0002 failed 2 times due to AM Container for appattempt_1486481563532_0002_000002 exited with exitCode: -1000
    For more detailed output, check the application tracking page: Then click on links to logs of each attempt.
    Diagnostics: Application application_1486481563532_0002 initialization failed (exitCode=255) with output: main : command provided 0
    main : run as user is user3
    main : requested yarn user is user3
    User user3 not found
    Failing this attempt. Failing the application.
The next step is to create the OS user “user3” on all the worker nodes, for example:
$ adduser user3
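On a cluster with more than a couple of workers, running adduser by hand on each node gets tedious. A minimal sketch of doing it over ssh is shown below, assuming you can log in to each node as root (or an equivalent account); the function name ensure_user_on_nodes and the host names in the usage line are placeholders, not taken from this cluster.

```shell
# Create a local OS account on each given node unless it already exists there.
# ensure_user_on_nodes is a hypothetical helper name.
ensure_user_on_nodes() {
  local user="$1"
  shift
  local host
  for host in "$@"; do
    # "id -u" succeeds only if the account exists, so adduser runs only when needed.
    ssh "$host" "id -u '$user' >/dev/null 2>&1 || adduser '$user'"
  done
}

# Usage (host names are placeholders):
#   ensure_user_on_nodes user3 worker1.example.com worker2.example.com
```

Checking with id first makes the function safe to re-run after adding new worker nodes.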
Restart the JDBC interpreter again.
Now, when I re-run the query, the job runs and completes successfully.