HDFS diskbalancer unexpected permission denied error

New Contributor

Hiya,

 

I am trying to balance the data disks on a few of our DataNodes. The cluster is Kerberos-enabled and uses Sentry. I get a permission denied error while trying to create a plan with the diskbalancer CLI tool. I don't understand why this is happening and would appreciate some help.

 

root@head01:~# klist
Ticket cache: FILE:/tmp/krb5cc_0
Default principal: hdfs@<OUR_REALM>

Valid starting       Expires              Service principal
08/28/2018 08:54:46  08/28/2018 18:54:46  krbtgt/<OUR_REALM>@<OUR_REALM>
        renew until 08/29/2018 08:54:39
root@head01:~# hdfs diskbalancer -plan node13.<our_fqdn>
18/08/28 09:25:52 INFO balancer.KeyManager: Block token params received from NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
18/08/28 09:25:52 INFO block.BlockTokenSecretManager: Setting block keys
18/08/28 09:25:52 INFO balancer.KeyManager: Update block keys every 2hrs, 30mins, 0sec
18/08/28 09:25:53 ERROR tools.DiskBalancerCLI: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied.
        at org.apache.hadoop.hdfs.server.datanode.DataNode.checkSuperuserPrivilege(DataNode.java:986)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.getDiskBalancerSetting(DataNode.java:3245)
        at org.apache.hadoop.hdfs.protocolPB.ClientDatanodeProtocolServerSideTranslatorPB.getDiskBalancerSetting(ClientDatanodeProtocolServerSideTranslatorPB.java:361)
        at org.apache.hadoop.hdfs.protocol.proto.ClientDatanodeProtocolProtos$ClientDatanodeProtocolService$2.callBlockingMethod(ClientDatanodeProtocolProtos.java:17901)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2281)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2277)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1917)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2275)

The error in node13's DATANODE logs is similar:

 

2018-08-28 09:25:38,529 WARN org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:hdfs@<OUR_REALM> (auth:KERBEROS) cause:org.apache.hadoop.security.AccessControlException: Permission denied.
2018-08-28 09:25:53,157 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 50020, call org.apache.hadoop.hdfs.protocol.ClientDatanodeProtocol.getDiskBalancerSetting from <node13_address>:58529 Call#9 Retry#0
org.apache.hadoop.security.AccessControlException: Permission denied.
        at org.apache.hadoop.hdfs.server.datanode.DataNode.checkSuperuserPrivilege(DataNode.java:986)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.getDiskBalancerSetting(DataNode.java:3245)
        at org.apache.hadoop.hdfs.protocolPB.ClientDatanodeProtocolServerSideTranslatorPB.getDiskBalancerSetting(ClientDatanodeProtocolServerSideTranslatorPB.java:361)
        at org.apache.hadoop.hdfs.protocol.proto.ClientDatanodeProtocolProtos$ClientDatanodeProtocolService$2.callBlockingMethod(ClientDatanodeProtocolProtos.java:17901)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2281)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2277)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1917)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2275)

 

Note that "hdfs diskbalancer -report" gives the expected output (i.e., no errors). If I switch to a non-hdfs principal it does give the expected error. So I'm a bit puzzled where the permission issue kicks in.

1 ACCEPTED SOLUTION

Expert Contributor
You are probably on CDH 5.10 or later.
Please log in as hdfs/node13.<our_fqdn>@<OUR_REALM> instead of hdfs@<OUR_REALM>.

Related jira: HDFS-11069. (Tighten the authorization of datanode RPC.)
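
A quick way to tell which kind of principal the current ticket holds is to look at the "Default principal" line of klist. The snippet below is only a sketch: is_host_principal is a hypothetical helper name, and it assumes the klist output format shown in the question.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of any Hadoop or Kerberos tooling).
# Reads `klist` output on stdin and succeeds only when the default
# principal has the host-qualified form hdfs/<host>@<realm>.
is_host_principal() {
  awk -F': ' '
    /^Default principal:/ { p = $2 }          # remember the principal name
    END {
      if (p ~ /^hdfs\/[^@]+@/) exit 0         # hdfs/<host>@<realm>
      exit 1                                  # e.g. plain hdfs@<realm>
    }'
}

# Example use on a Kerberos-enabled node (left commented out on purpose):
# klist | is_host_principal || echo "ticket is not for hdfs/<host>@<realm>"
```

With the ticket from the question (hdfs@<OUR_REALM>) the check fails, which matches the point where the diskbalancer -plan call is rejected.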


4 REPLIES 4

Champion

@Matt_

 

This may happen if your JAVA_HOME does not point to the right path.

export JAVA_HOME=<the right path, usually /usr/java>

Please check the Java path on node13, set it correctly, and try again; it may help.

 

New Contributor

Thank you kindly weichiu, that did the trick.

 

Just briefly, for people with the same problem, I had to:

 

1. SSH to the node in question
2. cd /var/run/cloudera-scm-agent/process && find . -type f -iname "hdfs.keytab"
n.b.: it will probably be under <pid>-hdfs-DATANODE/hdfs.keytab
3. Use the keytab to get a ticket
kinit -k -t ./<pid>-hdfs-DATANODE/hdfs.keytab -p hdfs/<NODE_FQDN>@<OUR_REALM>
4. Proceed with the diskbalancer plan/execution

This is in contrast to hdfs@<OUR_REALM>, which is a principal that was created manually (and is accepted for many superuser hdfs commands).
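
The steps above can be sketched as a pair of small shell helpers. This is a rough sketch, not a tested procedure: the function names are made up for illustration, and picking the most recently modified <pid>-hdfs-DATANODE directory is an assumption about how the process directories are laid out on your node. Directory names containing whitespace are not handled.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; names and the "newest directory wins" rule are assumptions.

# Build the host-specific principal, e.g. hdfs/node13.example.com@EXAMPLE.COM
datanode_principal() {
  local fqdn="$1" realm="$2"
  printf 'hdfs/%s@%s\n' "$fqdn" "$realm"
}

# Print the hdfs.keytab inside the most recently modified
# *-hdfs-DATANODE directory under the given process directory.
find_datanode_keytab() {
  local process_dir="${1:-/var/run/cloudera-scm-agent/process}"
  local dir
  # ls -td lists the directories themselves, newest first
  dir=$(ls -td "$process_dir"/*-hdfs-DATANODE 2>/dev/null | head -n 1)
  [ -n "$dir" ] && [ -f "$dir/hdfs.keytab" ] && printf '%s\n' "$dir/hdfs.keytab"
}

# Example use on the DataNode itself (left commented out on purpose;
# EXAMPLE.COM stands in for your realm):
# kinit -k -t "$(find_datanode_keytab)" "$(datanode_principal "$(hostname -f)" EXAMPLE.COM)"
# hdfs diskbalancer -plan "$(hostname -f)"
```

It is worth checking the result with klist before running the diskbalancer commands, since the keytab path changes whenever the role is restarted.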

Champion

@Matt_

 

Here are two easy steps that may save you some effort:

1. List the valid Kerberos principals in the keytab
	$ cd /var/run/cloudera-scm-agent/process/<pid>-hdfs-DATANODE
	$ klist -kt hdfs.keytab
	## klist lists the valid Kerberos principals in the format "hdfs/<NODE_FQDN>@<OUR_REALM>"

2. kinit with one of the principals listed above
	$ kinit -kt hdfs.keytab <copy and paste any one of the hdfs principals from the klist output above>
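
If you would rather not copy and paste, the principal can be pulled out of the klist output with awk. This is a minimal sketch that assumes the principal is the last column of each `klist -kt` entry; first_hdfs_principal is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper. Reads `klist -kt <keytab>` output on stdin and
# prints the first principal of the form hdfs/<host>@<realm>.
first_hdfs_principal() {
  # The principal is assumed to be the last whitespace-separated field.
  awk '$NF ~ /^hdfs\/[^@]+@/ { print $NF; exit }'
}

# Example use from inside the <pid>-hdfs-DATANODE directory
# (left commented out on purpose):
# kinit -kt hdfs.keytab "$(klist -kt hdfs.keytab | first_hdfs_principal)"
```

A keytab may hold entries for other services as well, which is why the filter only accepts names starting with hdfs/.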