Created on 08-28-2018 03:09 AM - edited 09-16-2022 06:38 AM
Hiya,
I am trying to balance the data disks on a few of our DataNodes. The cluster is Kerberos-enabled and uses Sentry. I get a permission denied error while trying to create a plan with the diskbalancer CLI tool. I don't understand why this is happening and would appreciate some help.
root@head01:~# klist
Ticket cache: FILE:/tmp/krb5cc_0
Default principal: hdfs@<OUR_REALM>

Valid starting       Expires              Service principal
08/28/2018 08:54:46  08/28/2018 18:54:46  krbtgt/<OUR_REALM>@<OUR_REALM>
        renew until 08/29/2018 08:54:39
root@head01:~# hdfs diskbalancer -plan node13.<our_fqdn>
18/08/28 09:25:52 INFO balancer.KeyManager: Block token params received from NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
18/08/28 09:25:52 INFO block.BlockTokenSecretManager: Setting block keys
18/08/28 09:25:52 INFO balancer.KeyManager: Update block keys every 2hrs, 30mins, 0sec
18/08/28 09:25:53 ERROR tools.DiskBalancerCLI: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied.
        at org.apache.hadoop.hdfs.server.datanode.DataNode.checkSuperuserPrivilege(DataNode.java:986)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.getDiskBalancerSetting(DataNode.java:3245)
        at org.apache.hadoop.hdfs.protocolPB.ClientDatanodeProtocolServerSideTranslatorPB.getDiskBalancerSetting(ClientDatanodeProtocolServerSideTranslatorPB.java:361)
        at org.apache.hadoop.hdfs.protocol.proto.ClientDatanodeProtocolProtos$ClientDatanodeProtocolService$2.callBlockingMethod(ClientDatanodeProtocolProtos.java:17901)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2281)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2277)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1917)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2275)
The error in node13's DATANODE logs is similar:
2018-08-28 09:25:38,529 WARN org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:hdfs@<OUR_REALM> (auth:KERBEROS) cause:org.apache.hadoop.security.AccessControlException: Permission denied.
2018-08-28 09:25:53,157 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 50020, call org.apache.hadoop.hdfs.protocol.ClientDatanodeProtocol.getDiskBalancerSetting from <node13_address>:58529 Call#9 Retry#0
org.apache.hadoop.security.AccessControlException: Permission denied.
        at org.apache.hadoop.hdfs.server.datanode.DataNode.checkSuperuserPrivilege(DataNode.java:986)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.getDiskBalancerSetting(DataNode.java:3245)
        at org.apache.hadoop.hdfs.protocolPB.ClientDatanodeProtocolServerSideTranslatorPB.getDiskBalancerSetting(ClientDatanodeProtocolServerSideTranslatorPB.java:361)
        at org.apache.hadoop.hdfs.protocol.proto.ClientDatanodeProtocolProtos$ClientDatanodeProtocolService$2.callBlockingMethod(ClientDatanodeProtocolProtos.java:17901)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2281)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2277)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1917)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2275)
Note that "hdfs diskbalancer -report" gives the expected output (i.e., no errors), and if I switch to a non-hdfs principal I get the expected permission error. So I'm a bit puzzled as to where the permission issue for -plan kicks in.
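In other words, the behaviour looks like this (a minimal sketch of the checks I ran; "someuser" is a placeholder for any non-hdfs principal):

# With the hdfs@<OUR_REALM> ticket in the cache:
hdfs diskbalancer -report                   # expected output, no errors
hdfs diskbalancer -plan node13.<our_fqdn>   # Permission denied (trace above)

# With a non-hdfs principal:
kdestroy
kinit someuser@<OUR_REALM>
hdfs diskbalancer -plan node13.<our_fqdn>   # Permission denied, here expected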
Created 08-28-2018 10:46 AM
It may be possible that your JAVA_HOME is not referring to the right path.
export JAVA_HOME=<the right path (usually /usr/java)>
Please check the Java path on node13, set it to the right path, and try again; it may help you.
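For example, a quick check on node13 (a minimal sketch; /usr/java/default is only an example location, point it at your actual JDK):

echo $JAVA_HOME                      # empty or wrong => needs to be set
readlink -f "$(which java)"          # the java binary actually on the PATH
export JAVA_HOME=/usr/java/default   # example path; adjust for your install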
Created 08-28-2018 10:58 AM
Created 08-29-2018 03:18 AM
Thank you kindly, weichiu, that did the trick.
Just briefly, for people with the same problem, I had to:
1. SSH to the node in question.
2. Locate the keytab:
   cd /var/run/cloudera-scm-agent/process && find . -type f -iname "hdfs.keytab"
   n.b.: it will probably be under <pid>-hdfs-DATANODE/hdfs.keytab
3. Use the keytab to get a ticket:
   kinit -k -t ./<pid>-hdfs-DATANODE/hdfs.keytab -p hdfs/<NODE_FQDN>@<OUR_REALM>
4. Proceed with the diskbalancer plan/execution (the steps are combined into one sketch below).
This is in contrast to hdfs@<OUR_REALM>, a principal that was created manually (and is accepted for many superuser hdfs commands).
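Putting steps 1-4 together, to be run on the node itself (a minimal sketch: <OUR_REALM> and node13.<our_fqdn> are this thread's placeholders, and the find pattern assumes a Cloudera Manager-managed DataNode):

# 1-2. Locate the DataNode role's keytab (the <pid> prefix changes over time).
cd /var/run/cloudera-scm-agent/process
KEYTAB=$(find . -type f -iname "hdfs.keytab" -path "*hdfs-DATANODE*" | head -n 1)

# 3. Get a ticket as the DataNode's own service principal.
kinit -k -t "$KEYTAB" -p hdfs/node13.<our_fqdn>@<OUR_REALM>

# 4. Create the plan; the command prints the path of the generated plan file,
#    which is then passed to -execute.
hdfs diskbalancer -plan node13.<our_fqdn>
hdfs diskbalancer -execute <path printed by the -plan step>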
Created 08-29-2018 03:29 AM
I can give you two easy steps; they may reduce your burden (automated in the sketch after the list):
1. List the valid Kerberos principals:
   $ cd /var/run/cloudera-scm-agent/process/<pid>-hdfs-DATANODE
   $ klist -kt hdfs.keytab
   ## The klist command will list the valid Kerberos principals in the following format: "hdfs/<NODE_FQDN>@<OUR_REALM>"
2. kinit with one of the principals listed above:
   $ kinit -kt hdfs.keytab <copy-paste any one of the hdfs principals from the klist output above>
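The same two steps in a single sketch (the awk field number assumes MIT klist's default column layout; double-check the principal it picks before using it):

cd /var/run/cloudera-scm-agent/process/<pid>-hdfs-DATANODE
# Take the first hdfs/... principal from the keytab listing.
PRINC=$(klist -kt hdfs.keytab | awk '/hdfs\//{print $4; exit}')
echo "Will kinit as: $PRINC"
kinit -kt hdfs.keytab "$PRINC"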