
Hive/Druid query 'permission denied' from Tez/Hadoop

Contributor

Hi all, I have recently set up a new Druid/Hive stack (and yes, I am new to both, and to Hadoop/Tez, etc.). I have used the user admin across the entire stack for the entire setup via HDP 3.0.

I have been able to run some basic queries from Hive without an issue. For example, this:

select `__time`,accepteddate,accountid from sms_messages group by accountid, accepteddate, `__time` limit 100;

works without issue, but if I run this (from SQuirreL):

select max(`__time`),accepteddate,accountid from sms_messages group by accountid, accepteddate;

(note: this same query works when run on Druid directly)

it fails with:

Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.tez.TezTask
SQLState: 08S01
ErrorCode: 1

From Beeline (with more details):

WARN  : The session: sessionId=d797cf5f-3b33-49bb-aed9-c57f0e6a30c0, queueName=null, user=admin, doAs=true, isOpen=false, isDefault=false has not been opened
INFO  : Subscribed to counters: [] for queryId: hive_20190213215439_c3cfa7e9-ea7c-4dc2-930b-dc8d1cee41fc
INFO  : Tez session hasn't been created yet. Opening session
ERROR : Failed to execute tez graph.
org.apache.hadoop.security.AccessControlException: Permission denied: user=admin, access=WRITE, inode="/user":hdfs:hdfs:drwxr-xr-x

FULL STACK DETAILS BELOW

0: jdbc:hive2://wdc-tst-bdrd-001.openmarket.c> select max(`__time`),accepteddate,accountid from sms_messages group by accountid, accepteddate;
INFO  : Compiling command(queryId=hive_20190213215031_4b3f0067-f772-4705-8f4e-4da82e1fa8f4): select max(`__time`),accepteddate,accountid from sms_messages group by accountid, accepteddate
INFO  : Semantic Analysis Completed (retrial = false)
INFO  : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:_c0, type:timestamp with local time zone, comment:null), FieldSchema(name:accepteddate, type:string, comment:null), FieldSchema(name:accountid, type:string, comment:null)], properties:null)
INFO  : Completed compiling command(queryId=hive_20190213215031_4b3f0067-f772-4705-8f4e-4da82e1fa8f4); Time taken: 0.119 seconds
INFO  : Executing command(queryId=hive_20190213215031_4b3f0067-f772-4705-8f4e-4da82e1fa8f4): select max(`__time`),accepteddate,accountid from sms_messages group by accountid, accepteddate
INFO  : Query ID = hive_20190213215031_4b3f0067-f772-4705-8f4e-4da82e1fa8f4
INFO  : Total jobs = 1
INFO  : Launching Job 1 out of 1
INFO  : Starting task [Stage-1:MAPRED] in serial mode
WARN  : The session: sessionId=3b91cc44-16a0-46c0-8c27-6d2b15697ba7, queueName=null, user=admin, doAs=true, isOpen=false, isDefault=false has not been opened
INFO  : Subscribed to counters: [] for queryId: hive_20190213215031_4b3f0067-f772-4705-8f4e-4da82e1fa8f4
INFO  : Tez session hasn't been created yet. Opening session
ERROR : Failed to execute tez graph.
org.apache.hadoop.security.AccessControlException: Permission denied: user=admin, access=WRITE, inode="/user":hdfs:hdfs:drwxr-xr-x
    at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:399)
    at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:255)
    at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:193)
    at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1857)
    at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1841)
    at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkAncestorAccess(FSDirectory.java:1800)
    at org.apache.hadoop.hdfs.server.namenode.FSDirMkdirOp.mkdirs(FSDirMkdirOp.java:59)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3150)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:1126)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:707)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[?:1.8.0_112]
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) ~[?:1.8.0_112]
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[?:1.8.0_112]
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423) ~[?:1.8.0_112]
    at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:121) ~[hadoop-common-3.1.1.3.1.0.0-78.jar:?]
    at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:88) ~[hadoop-common-3.1.1.3.1.0.0-78.jar:?]
    at org.apache.hadoop.hdfs.DFSClient.primitiveMkdir(DFSClient.java:2417) ~[hadoop-hdfs-client-3.1.1.3.1.0.0-78.jar:?]
    at org.apache.hadoop.hdfs.DFSClient.mkdirs(DFSClient.java:2391) ~[hadoop-hdfs-client-3.1.1.3.1.0.0-78.jar:?]
    at org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1325) ~[hadoop-hdfs-client-3.1.1.3.1.0.0-78.jar:?]
    at org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1322) ~[hadoop-hdfs-client-3.1.1.3.1.0.0-78.jar:?]
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) ~[hadoop-common-3.1.1.3.1.0.0-78.jar:?]
    at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirsInternal(DistributedFileSystem.java:1339) ~[hadoop-hdfs-client-3.1.1.3.1.0.0-78.jar:?]
    at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:1314) ~[hadoop-hdfs-client-3.1.1.3.1.0.0-78.jar:?]
    at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:2275) ~[hadoop-common-3.1.1.3.1.0.0-78.jar:?]
    at org.apache.hadoop.hive.ql.exec.tez.DagUtils.getDefaultDestDir(DagUtils.java:1001) ~[hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
    at org.apache.hadoop.hive.ql.exec.tez.DagUtils.getHiveJarDirectory(DagUtils.java:1153) ~[hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
    at org.apache.hadoop.hive.ql.exec.tez.TezSessionState.createJarLocalResource(TezSessionState.java:896) ~[hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
    at org.apache.hadoop.hive.ql.exec.tez.TezSessionState.makeCombinedJarMap(TezSessionState.java:349) ~[hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
    at org.apache.hadoop.hive.ql.exec.tez.TezSessionState.openInternal(TezSessionState.java:418) ~[hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
    at org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolSession.openInternal(TezSessionPoolSession.java:124) ~[hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
    at org.apache.hadoop.hive.ql.exec.tez.TezSessionState.open(TezSessionState.java:373) ~[hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
    at org.apache.hadoop.hive.ql.exec.tez.TezTask.ensureSessionHasResources(TezTask.java:372) ~[hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
    at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:199) ~[hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:210) ~[hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97) ~[hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2711) ~[hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:2382) ~[hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:2054) ~[hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1752) ~[hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1746) ~[hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
    at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:157) ~[hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
    at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:226) ~[hive-service-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
    at org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation.java:87) ~[hive-service-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
    at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:324) ~[hive-service-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
    at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_112]
    at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_112]
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) ~[hadoop-common-3.1.1.3.1.0.0-78.jar:?]
    at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:342) ~[hive-service-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_112]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_112]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_112]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_112]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[?:1.8.0_112]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[?:1.8.0_112]
    at java.lang.Thread.run(Thread.java:745) [?:1.8.0_112]
Caused by: org.apache.hadoop.ipc.RemoteException: Permission denied: user=admin, access=WRITE, inode="/user":hdfs:hdfs:drwxr-xr-x
    at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:399)
    at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:255)
    at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:193)
    at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1857)
    at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1841)
    at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkAncestorAccess(FSDirectory.java:1800)
    at org.apache.hadoop.hdfs.server.namenode.FSDirMkdirOp.mkdirs(FSDirMkdirOp.java:59)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3150)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:1126)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:707)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)
    at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1497) ~[hadoop-common-3.1.1.3.1.0.0-78.jar:?]
    at org.apache.hadoop.ipc.Client.call(Client.java:1443) ~[hadoop-common-3.1.1.3.1.0.0-78.jar:?]
    at org.apache.hadoop.ipc.Client.call(Client.java:1353) ~[hadoop-common-3.1.1.3.1.0.0-78.jar:?]
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228) ~[hadoop-common-3.1.1.3.1.0.0-78.jar:?]
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116) ~[hadoop-common-3.1.1.3.1.0.0-78.jar:?]
    at com.sun.proxy.$Proxy32.mkdirs(Unknown Source) ~[?:?]
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.mkdirs(ClientNamenodeProtocolTranslatorPB.java:653) ~[hadoop-hdfs-client-3.1.1.3.1.0.0-78.jar:?]
    at sun.reflect.GeneratedMethodAccessor23.invoke(Unknown Source) ~[?:?]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_112]
    at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_112]
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422) ~[hadoop-common-3.1.1.3.1.0.0-78.jar:?]
    at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165) ~[hadoop-common-3.1.1.3.1.0.0-78.jar:?]
    at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157) ~[hadoop-common-3.1.1.3.1.0.0-78.jar:?]
    at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95) ~[hadoop-common-3.1.1.3.1.0.0-78.jar:?]
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359) ~[hadoop-common-3.1.1.3.1.0.0-78.jar:?]
    at com.sun.proxy.$Proxy33.mkdirs(Unknown Source) ~[?:?]
    at org.apache.hadoop.hdfs.DFSClient.primitiveMkdir(DFSClient.java:2415) ~[hadoop-hdfs-client-3.1.1.3.1.0.0-78.jar:?]
    ... 38 more
ERROR : FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.tez.TezTask
INFO  : Completed executing command(queryId=hive_20190213215031_4b3f0067-f772-4705-8f4e-4da82e1fa8f4); Time taken: 0.533 seconds
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.tez.TezTask (state=08S01,code=1)
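For reference, the inode named in the error can be confirmed with any HDFS client (this is a read-only check, so it should work from any user):

# hdfs dfs -ls /

The /user entry in that listing should show owner hdfs, group hdfs, and mode drwxr-xr-x, matching inode="/user":hdfs:hdfs:drwxr-xr-x in the trace above, which would mean only the HDFS superuser can create directories under it.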

2 REPLIES

Master Mentor

@Dan Hops

We can see the following error:

Permission denied: user=admin, access=WRITE, inode="/user":hdfs:hdfs:drwxr-xr-x


This means you are trying to run the job as the 'admin' user, but that user has no home directory in HDFS yet. Please make sure the "/user/admin" directory is created in HDFS by the HDFS superuser, and then you can run your jobs.

So please do the following:

# su - hdfs -c "hdfs dfs -mkdir /user/admin"
# su - hdfs -c "hdfs dfs -chown -R admin:hdfs /user/admin"
# su - hdfs -c "hdfs dfs -chmod 755 /user/admin"


Then try running your jobs.


Contributor

Indeed, thank you, that solved the issue. I was just surprised that I needed to do all that; I thought Ambari would have set that up from the beginning.

A follow-up question:

When I run the query on Druid directly, it returns in about 2 seconds. When I run the same query through Hive, it gets converted to a MapReduce job and takes 2+ minutes to run. Any thoughts on why? My guess is that when it goes through Hive, the query is not being pushed down to Druid, and instead all the data is being streamed back to Hive?
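For what it's worth, one way I can think of to check whether the aggregation is being pushed down is to look at the query plan (assuming Hive's Calcite-based Druid integration behaves as documented; property names may differ slightly by Hive version):

explain select max(`__time`),accepteddate,accountid from sms_messages group by accountid, accepteddate;

If the rewrite succeeded, the table scan in the plan should carry a generated native Druid query (a druid.query.json property on the scan of sms_messages). If instead the plan shows ordinary Tez/MapReduce group-by and reduce stages over the Druid table, Hive is pulling the rows back and aggregating them itself, which would explain the 2-minute runtime.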