
Knox impersonation issue

Contributor

I have the following use case:

An application connects to the Knox gateway to run a Hive source -> Hive target job; Knox connects to the Hive service and submits the request on the application's behalf.

I have a user guest created for Knox access, also created as a Unix user. I am trying to impersonate the user adpqa while submitting the job via Knox.

I am getting the following error in the HiveServer2 log:

Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException): User: hive is not allowed to impersonate adpqa
    at org.apache.hadoop.ipc.Client.call(Client.java:1427)
    at org.apache.hadoop.ipc.Client.call(Client.java:1358)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
    at com.sun.proxy.$Proxy15.getFileInfo(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:771)
    at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
    at com.sun.proxy.$Proxy16.getFileInfo(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2116)
    at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1305)
    at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1301)
    at org.apache.hadoop.hive.common.FileUtils.getFileStatusOrNull(FileUtils.java:757)
    at org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider.checkPermissions(StorageBasedAuthorizationProvider.java:364)
    at org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider.authorize(StorageBasedAuthorizationProvider.java:339)
    ... 74 more
2016-02-12 12:38:17,578 INFO [HiveServer2-HttpHandler-Pool: Thread-36]: thrift.ThriftHttpServlet (ThriftHttpServlet.java:doPost(127)) - Could not validate cookie sent, will try to generate a new cookie
2016-02-12 12:38:17,578 INFO [HiveServer2-HttpHandler-Pool: Thread-36]: thrift.ThriftHttpServlet (ThriftHttpServlet.java:doPost(169)) - Cookie added for clientUserName anonymous
2016-02-12 12:38:17,578 INFO [HiveServer2-HttpHandler-Pool: Thread-36]: thrift.ThriftCLIService (ThriftCLIService.java:OpenSession(294)) - Client protocol version: HIVE_CLI_SERVICE_PROTOCOL_V8
2016-02-12 12:38:17,580 INFO [HiveServer2-HttpHandler-Pool: Thread-36]: metastore.ObjectStore (ObjectStore.java:initialize(290)) - ObjectStore, initialize called
2016-02-12 12:38:18,261 WARN [HiveServer2-HttpHandler-Pool: Thread-36]: conf.HiveConf (HiveConf.java:initialize(2774)) - HiveConf of name hive.server2.enable.impersonation does not exist
2016-02-12 12:38:18,262 INFO [HiveServer2-HttpHandler-Pool: Thread-36]: metastore.ObjectStore (ObjectStore.java:getPMF(375)) - Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,Database,Type,FieldSchema,Order"

I have made sure that adpqa exists as a user both on Unix and in HDFS.

adpqa@ivlhdp61:/var/log/hive> hadoop fs -ls /user

Found 5 items
drwxr-xr-x   - adpqa supergroup          0 2016-02-12 11:54 /user/adpqa

Both hive and adpqa are part of the users group:

adpqa@ivlhdp61:/var/log/hive> groups hive

hive : hadoop users

adpqa@ivlhdp61:/var/log/hive> groups adpqa

adpqa : users dialout video hadoop

The following is the HDFS configuration on the cluster.

(Screenshots 2092-config1.png and 2093-config2.png showing the HDFS configuration are not reproduced here.)
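For reference, the HDFS-side impersonation check that fails here is governed by the proxyuser entries in core-site.xml. A fully open configuration (the values below are illustrative, not necessarily what the cluster had) would look like:

hadoop.proxyuser.hive.hosts=*
hadoop.proxyuser.hive.groups=*
hadoop.proxyuser.knox.hosts=*
hadoop.proxyuser.knox.groups=*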

I am unable to understand why we get the error "User: hive is not allowed to impersonate adpqa".

Is some more configuration missing?

1 ACCEPTED SOLUTION

Contributor

With the new cluster setup, we do not see this issue anymore. I believe the issue was due to improper configuration.


14 REPLIES

Master Guru

Hi @Vishal Shah, you also need the following in Hive --> Configs:

webhcat.proxyuser.knox.groups=*
webhcat.proxyuser.knox.hosts=*
hive.server2.allow.user.substitution=true

Try knox.groups and knox.hosts first with "*", and if that works, reduce the permissions to, for example, "users" and your Knox host FQDN, as shown below. The full manual covers this; scroll down to the Hive section.
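For example, once the wildcard works, a tightened version might look like this (the hostname below is a placeholder):

webhcat.proxyuser.knox.groups=users
webhcat.proxyuser.knox.hosts=knoxhost.example.com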

Contributor

Hi Predrag,

We tried this as well, but the issue still exists.

It is strange that using beeline with the same JDBC connection string I am able to execute queries successfully, but when running from the application it does not work.
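For reference, the beeline call goes through Knox with a JDBC URL of roughly this shape (hostname, port, truststore path, and passwords below are placeholders):

beeline -u "jdbc:hive2://knoxhost.example.com:8443/;ssl=true;sslTrustStore=/path/to/truststore.jks;trustStorePassword=<password>;transportMode=http;httpPath=gateway/default/hive" -n guest -p <password>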


I didn't realize that beeline was already working via Knox. A few questions then:

  1. What application is making the HS2 call via Knox?
  2. Is the application using JDBC or ODBC drivers and what version?
  3. What does your JDBC connect string look like (without real hostname or passwords of course)?

Contributor

Hi Kevin,

Thanks for the reply. I was away for a while and couldn't follow up on the issue.

With the new cluster setup, we do not see this issue anymore. I believe the issue was due to improper configuration.
