Created on 02-12-2016 09:52 AM - edited 08-19-2019 01:26 AM
I have the following use case:
An application connects to the Knox gateway and tries to run a Hive source -> Hive target job. Knox then connects to the Hive service and submits the request on the application's behalf.
I have a user guest created for Knox access, which also exists as a Unix user. I am trying to impersonate user adpqa while submitting the job via Knox.
I am getting the following error in the HiveServer2 log.
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException): User: hive is not allowed to impersonate adpqa
    at org.apache.hadoop.ipc.Client.call(Client.java:1427)
    at org.apache.hadoop.ipc.Client.call(Client.java:1358)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
    at com.sun.proxy.$Proxy15.getFileInfo(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:771)
    at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
    at com.sun.proxy.$Proxy16.getFileInfo(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2116)
    at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1305)
    at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1301)
    at org.apache.hadoop.hive.common.FileUtils.getFileStatusOrNull(FileUtils.java:757)
    at org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider.checkPermissions(StorageBasedAuthorizationProvider.java:364)
    at org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider.authorize(StorageBasedAuthorizationProvider.java:339)
    ... 74 more
2016-02-12 12:38:17,578 INFO [HiveServer2-HttpHandler-Pool: Thread-36]: thrift.ThriftHttpServlet (ThriftHttpServlet.java:doPost(127)) - Could not validate cookie sent, will try to generate a new cookie
2016-02-12 12:38:17,578 INFO [HiveServer2-HttpHandler-Pool: Thread-36]: thrift.ThriftHttpServlet (ThriftHttpServlet.java:doPost(169)) - Cookie added for clientUserName anonymous
2016-02-12 12:38:17,578 INFO [HiveServer2-HttpHandler-Pool: Thread-36]: thrift.ThriftCLIService (ThriftCLIService.java:OpenSession(294)) - Client protocol version: HIVE_CLI_SERVICE_PROTOCOL_V8
2016-02-12 12:38:17,580 INFO [HiveServer2-HttpHandler-Pool: Thread-36]: metastore.ObjectStore (ObjectStore.java:initialize(290)) - ObjectStore, initialize called
2016-02-12 12:38:18,261 WARN [HiveServer2-HttpHandler-Pool: Thread-36]: conf.HiveConf (HiveConf.java:initialize(2774)) - HiveConf of name hive.server2.enable.impersonation does not exist
2016-02-12 12:38:18,262 INFO [HiveServer2-HttpHandler-Pool: Thread-36]: metastore.ObjectStore (ObjectStore.java:getPMF(375)) - Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,Database,Type,FieldSchema,Order"
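Note the WARN line above, "HiveConf of name hive.server2.enable.impersonation does not exist": that property name is not recognized by this Hive version. The standard impersonation switch is hive.server2.enable.doAs. A minimal sketch of the intended setting in hive-site.xml (the value true is an assumption, matching the goal of running queries as the calling user rather than as hive):

hive.server2.enable.doAs=true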
I have made sure that adpqa exists as a user both on Unix and in HDFS.
adpqa@ivlhdp61:/var/log/hive> hadoop fs -ls /user
Found 5 items
drwxr-xr-x   - adpqa supergroup          0 2016-02-12 11:54 /user/adpqa
Both hive and adpqa are part of the users group:
adpqa@ivlhdp61:/var/log/hive> groups hive
hive : hadoop users
adpqa@ivlhdp61:/var/log/hive> groups adpqa
adpqa : users dialout video hadoop
Following is the HDFS configuration on the cluster.
I am unable to understand why we get the error "User: hive is not allowed to impersonate adpqa".
Is some more configuration missing?
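For reference, this particular message means HDFS is rejecting the hive service user as a proxy for adpqa, which is governed by the Hadoop proxyuser settings in core-site.xml. A minimal sketch of the properties involved (the "*" values are the most permissive form and can later be narrowed to specific groups and host FQDNs; HDFS and HiveServer2 need a restart after changing them):

hadoop.proxyuser.hive.hosts=*
hadoop.proxyuser.hive.groups=*
hadoop.proxyuser.knox.hosts=*
hadoop.proxyuser.knox.groups=*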
Created 02-26-2016 06:42 AM
With the new cluster setup, we do not see this issue anymore. I believe the issue was due to improper configuration.
Created 02-16-2016 06:01 AM
Hi @Vishal Shah, you also need the following in Hive --> Configs:
webhcat.proxyuser.knox.groups=*
webhcat.proxyuser.knox.hosts=*
hive.server2.allow.user.substitution=true
Try knox.groups and knox.hosts first with "*", and if that works, narrow the permissions down to, for example, "users" and your Knox host's FQDN. The full manual is here; scroll down to the Hive section.
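Once the wildcard works, the narrowed form could look like this (knoxhost.example.com is a placeholder for the actual Knox gateway FQDN):

webhcat.proxyuser.knox.groups=users
webhcat.proxyuser.knox.hosts=knoxhost.example.com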
Created 02-16-2016 10:43 AM
Hi Predrag,
We tried this as well, but the issue still exists.
It is strange that, using Beeline with the same JDBC connection string, I am able to execute queries successfully.
But when running from the application, it does not work.
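For comparison, a Beeline connection through Knox in HTTP transport mode typically looks like the following (host, port, truststore, and credentials are all placeholders; the exact httpPath depends on the Knox topology in use):

beeline -u "jdbc:hive2://knoxhost.example.com:8443/;ssl=true;sslTrustStore=/path/to/gateway.jks;trustStorePassword=changeit;transportMode=http;httpPath=gateway/default/hive" -n guest -p guest-password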
Created 02-16-2016 02:49 PM
I hadn't understood that Beeline was already working via Knox. A few questions then:
Created 02-26-2016 06:42 AM
Hi Kevin,
Thanks for the reply. I was away for a while and couldn't follow up on the issue.
With the new cluster setup, we do not see this issue anymore. I believe the issue was due to improper configuration.