Created 07-19-2016 06:59 PM
Hi, we're kerberizing our HDP cluster. As part of that process, we kerberized our QA cluster and are testing all our Oozie workflows in the Kerberos environment. We were able to run java and hive actions successfully, but are stuck on shell actions that run a Hive query from inside the shell script. We've tried multiple approaches, but none of them works.
Here is what we tried:
Added a "credentials" section to our workflow, similar to what we do for hive actions.
Ran kinit inside the shell script before launching the Hive CLI.
We get this error with both approaches:
Exception in thread "main" java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException(java.io.IOException): Delegation Token can be issued only with kerberos or web authentication
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getDelegationToken(FSNamesystem.java:6744)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getDelegationToken(NameNodeRpcServer.java:628)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getDelegationToken(ClientNamenodeProtocolServerSideTranslatorPB.java:987)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2151)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2147)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2145)
    at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:507)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:680)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:624)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: org.apache.hadoop.ipc.RemoteException(java.io.IOException): Delegation Token can be issued only with kerberos or web authentication
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getDelegationToken(FSNamesystem.java:6744)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getDelegationToken(NameNodeRpcServer.java:628)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getDelegationToken(ClientNamenodeProtocolServerSideTranslatorPB.java:987)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2151)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2147)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2145)
    at org.apache.hadoop.ipc.Client.call(Client.java:1427)
    at org.apache.hadoop.ipc.Client.call(Client.java:1358)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
    at com.sun.proxy.$Proxy14.getDelegationToken(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getDelegationToken(ClientNamenodeProtocolTranslatorPB.java:933)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:252)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
    at com.sun.proxy.$Proxy15.getDelegationToken(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient.getDelegationToken(DFSClient.java:1043)
    at org.apache.hadoop.hdfs.DistributedFileSystem.getDelegationToken(DistributedFileSystem.java:1552)
    at org.apache.hadoop.fs.FileSystem.collectDelegationTokens(FileSystem.java:530)
    at org.apache.hadoop.fs.FileSystem.addDelegationTokens(FileSystem.java:508)
    at org.apache.hadoop.hdfs.DistributedFileSystem.addDelegationTokens(DistributedFileSystem.java:2238)
    at org.apache.tez.common.security.TokenCache.obtainTokensForFileSystemsInternal(TokenCache.java:107)
    at org.apache.tez.common.security.TokenCache.obtainTokensForFileSystemsInternal(TokenCache.java:86)
    at org.apache.tez.common.security.TokenCache.obtainTokensForFileSystems(TokenCache.java:76)
    at org.apache.tez.client.TezClientUtils.setupTezJarsLocalResources(TezClientUtils.java:200)
    at org.apache.tez.client.TezClient.getTezJarResources(TezClient.java:845)
    at org.apache.tez.client.TezClient.start(TezClient.java:380)
    at org.apache.hadoop.hive.ql.exec.tez.TezSessionState.open(TezSessionState.java:196)
    at org.apache.hadoop.hive.ql.exec.tez.TezSessionState.open(TezSessionState.java:117)
    at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:504)
    ... 8 more
We followed the blog post using-oozie-in-kerberized-cluster; please refer to bullet point #7 in that post.
Please advise. Thanks much.
Created 07-19-2016 07:08 PM
@Sunile Manjee thanks much for your response.
We did try "kinit" inside the shell script and still got the same error. The only difference in our setup is that the keytab is not on HDFS: we pushed it to all the cluster nodes, so it is available on the local file system.
kinit foo@TEST.COM -k -t /etc/security/keytabs/foo.headless.keytab
Created 07-19-2016 07:08 PM
@Gopichand Mummineni this is how I would do it. I got much of this from @Benjamin Leonhardi's feedback.
Shell Action
Created 07-19-2016 07:18 PM
@Gopichand Mummineni my understanding is that it has to be run as oozie and not as foo@*. @Benjamin Leonhardi please confirm or correct this understanding.
Created 07-27-2016 12:09 AM
@Gopichand Mummineni - I got it working. I will write a blog post and update you shortly.
Created 07-29-2016 05:28 AM
I got this working.
Please refer to https://community.hortonworks.com/content/kbentry/48132/oozie-shell-action-run-hive-query-in-shell-s...
Created 07-29-2016 04:14 PM
Thanks for the response @Kuldeep Kulkarni.
I followed the exact approach you explained in your blog. The only difference was that we had Tez as our execution engine for Hive.
Anyhow, I tried this first and it didn't work. I believe that is because I was switching the execution engine to mr within the Hive CLI, which is too late: Hive attempts to launch the Tez AppMaster when the CLI itself starts, before it runs the query.
hive -e "SET hive.execution.engine=mr; SET mapreduce.job.credentials.binary=${HADOOP_TOKEN_FILE_LOCATION}; select MAX(update_time) from test_db.test_table;" -S
Then I changed my command to pass the engine through --hiveconf, so that the execution engine is already set to mr before Hive attempts to launch the Tez AppMaster. This one worked!!!
hive -e "SET mapreduce.job.credentials.binary=${HADOOP_TOKEN_FILE_LOCATION}; select MAX(update_time) from test_db.test_table;" -S --hiveconf hive.execution.engine=mr
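For anyone reusing this in a shell action script, here is a minimal sketch of the working command wrapped in a function. The function name run_hive_query is my own (hypothetical, not from the blog), and the sketch assumes the action's container exports HADOOP_TOKEN_FILE_LOCATION, which is the variable the command above already relies on. It fails early if the token file location is missing instead of letting hive fail with the delegation token error.

```shell
#!/usr/bin/env bash
# Sketch only: run_hive_query is a hypothetical helper name.
# Assumes HADOOP_TOKEN_FILE_LOCATION is exported in the shell action's environment.

run_hive_query() {
  local query="$1"

  # Without the token file there is nothing to hand to the MapReduce job.
  if [ -z "${HADOOP_TOKEN_FILE_LOCATION:-}" ]; then
    echo "HADOOP_TOKEN_FILE_LOCATION is not set" >&2
    return 1
  fi

  # The engine is switched with --hiveconf so it applies at CLI startup,
  # before any Tez session is opened; the credentials file is set with SET,
  # exactly as in the command that worked above.
  hive -S \
    --hiveconf hive.execution.engine=mr \
    -e "SET mapreduce.job.credentials.binary=${HADOOP_TOKEN_FILE_LOCATION}; ${query}"
}
```

In the script it would be called as run_hive_query "select MAX(update_time) from test_db.test_table;".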
I also tried the other options listed below. These didn't work either.
hive -e "select MAX(update_time) from test_db.test_table;" -S --hiveconf tez.credentials.path=${HADOOP_TOKEN_FILE_LOCATION}
hive -e "SET tez.credentials.path=${HADOOP_TOKEN_FILE_LOCATION}; select MAX(update_time) from test_db.test_table;" -S
I am curious to understand why passing the credentials to Tez doesn't work. Are you aware of any open Apache bug for this? Understanding this better will help me in the future.
Thanks again.