Created 08-30-2016 06:38 PM
Hello experts,
While running the balancer utility as the hdfs user, I'm getting the error below.
HDFS Balancer - Access denied for user hdfs-prod. Superuser privilege is required
Note - I'm running as user hdfs against a kerberized cluster. My principal name is hdfs-prod@ABC.NET
dfs.permissions.superusergroup = hdfs
and
hdfs groups hdfs
shows
hdfs : hadoop hdfs
I'm sure I'm missing something here.
Thanks
Mayank
Created 08-30-2016 09:56 PM
Found it.
I would have left it at that; however, since this is Production, sometimes you just need to know.
For replication, Falcon needs to be aware of the local and remote cluster IDs via the property below.
dfs.nameservices
Mine has a value like "prodID, devID".
The balancer tries to reach every nameservice listed there. When I run the command on the prod cluster as hdfs with a proper ticket, it throws an access-denied error for "hdfs-prod" (my principal without the realm), yet it still balances the prod cluster. So the error, although not clear, is actually a permission denial from the remote nameservice. That makes sense: the user is still "hdfs", but the principal on the remote cluster is different, "hdfs-dev" in my case.
I ran the same command in Dev and the cluster was rebalanced; however, I got the same error, this time:
Access denied for user hdfs-dev. Superuser privilege is required.
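To see which nameservices a client on a given host is configured with, something along these lines may help. It is only a sketch: split_nameservices is a hypothetical helper name, and the hdfs getconf call is skipped when no hdfs client is on the PATH.

```shell
# Split a comma-separated dfs.nameservices value into one ID per line,
# trimming whitespace around each entry and dropping empty entries.
# (split_nameservices is a hypothetical helper, not part of Hadoop.)
split_nameservices() {
  printf '%s\n' "$1" | tr ',' '\n' \
    | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' -e '/^$/d'
}

# Only query the configuration when the hdfs client is installed here.
if command -v hdfs >/dev/null 2>&1; then
  split_nameservices "$(hdfs getconf -confKey dfs.nameservices)"
fi
```

With the value quoted above, split_nameservices "prodID, devID" lists prodID and devID on separate lines. If more than one ID shows up, an access-denied message may be coming from a cluster other than the one being balanced.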
Thanks for support @emaxwell, @mqureshi, @Kuldeep Kulkarni.
I hope the above answer will help others. (A few other hdfs commands have no effect/errors.)
Thanks
Mayank
Created 08-30-2016 06:42 PM
Please check your core-site.xml to see if you have added an auth_to_local rule.
Generally it should have the rule below set up:
RULE:[1:$1@$0](hdfs-prod@ABC.NET)s/.*/hdfs/
Please check and add it if it's not there.
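As a rough illustration of what that rule does, here is a shell sketch. map_principal is a hypothetical stand-in that mimics this single rule only; it is not Hadoop's implementation, and the NO_MATCH output is an invention of this sketch. The last lines call the HadoopKerberosName class, which I believe prints the real mapping, but only if a hadoop client is on the PATH.

```shell
# Simplified imitation of RULE:[1:$1@$0](hdfs-prod@ABC.NET)s/.*/hdfs/
# (map_principal is hypothetical; real evaluation happens inside Hadoop.)
map_principal() {
  principal="$1"
  name="${principal%@*}"            # part before the realm
  case "$name" in
    */*) echo "NO_MATCH"; return 1 ;;   # [1:...] only covers one-component names
  esac
  # The filter in parentheses must match the whole "$1@$0" string.
  if printf '%s\n' "$principal" | grep -Eqx 'hdfs-prod@ABC.NET'; then
    printf '%s\n' "$principal" | sed 's/.*/hdfs/'   # the rule's substitution
  else
    echo "NO_MATCH"; return 1
  fi
}

map_principal "hdfs-prod@ABC.NET"

# Ask Hadoop itself, when a client is installed on this host.
if command -v hadoop >/dev/null 2>&1; then
  hadoop org.apache.hadoop.security.HadoopKerberosName hdfs-prod@ABC.NET || true
fi
```

In this sketch hdfs-prod@ABC.NET maps to hdfs, while a principal such as hdfs-dev@ABC.NET falls through the rule unmatched.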
Created 08-30-2016 06:48 PM
Thanks for the pointer @Kuldeep Kulkarni; however, the rules are set properly. I can do pretty much everything with the keytab except run the balancer command.
Thanks
Mayank
Created 08-30-2016 07:01 PM
In order to do superuser commands (like enter safe mode, balance cluster, etc.), you have to run the command as the user that started the NameNode process. If the NameNode is running as the hdfs user, then you will need to issue these commands as the hdfs user:
sudo -u hdfs hdfs balancer -threshold 5
Created 08-30-2016 07:39 PM
Thanks @emaxwell,
I'm running those commands as the hdfs user; 'hdfs dfsadmin -safemode' works fine.
I can pretty much use the balancer from Ambari, but I'm super curious to know what went wrong here.
Thanks
Mayank
Created 08-30-2016 08:05 PM
@mkataria your error says you are running it as "hdfs-prod". You might have made this user a member of the supergroup, but that still doesn't quite make him the superuser. There is only one superuser, and that's the user who started the NameNode, as @emaxwell pointed out (likely username "hdfs").
Think about the root user in Linux. You can create other users and make them members of the root group, but there is still only one root user. That user in Hadoop is hdfs and no other user. More details here.
Created 08-30-2016 08:30 PM
Thanks @mqureshi,
As stated above, I'm running the command as "hdfs"; "hdfs-prod" is my principal name without the realm name.
This user is part of the superusergroup and all the mappings are showing up right. I'm sure it's something that escapes the eye.
Regards
Mayank
Created 08-30-2016 08:37 PM
@mkataria so I am assuming here is the sequence of your commands.
1. Assuming you are root, you do "su hdfs". So now you are hdfs.
2. kinit -k -t hdfs-prod@ABC.NET
3. now you run "hdfs balancer -threshold <your threshold>"
If my assumptions are right, then my question is whether your principal "hdfs-prod@REALM.COM" has authority to impersonate in core-site.xml, something like below?
<property>
  <name>hadoop.proxyuser.hdfs-prod.hosts</name>
  <value>host1,host2</value>
</property>
<property>
  <name>hadoop.proxyuser.hdfs-prod.groups</name>
  <value>group1,group2,supergroup,hdfs</value>
</property>
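Before looking at impersonation, it may be worth confirming which principal the ticket cache actually holds after the kinit step above, since that name minus the realm is what appears in the "Access denied for user ..." message. A small sketch, assuming MIT Kerberos klist output with a "Default principal:" line; ticket_short_name is a hypothetical helper.

```shell
# Read klist output on stdin and print the principal's name without the realm.
# Assumes the MIT Kerberos "Default principal: name@REALM" line format.
# (ticket_short_name is a hypothetical helper, not a Kerberos tool.)
ticket_short_name() {
  sed -n 's/^Default principal: \([^@]*\)@.*$/\1/p'
}

# Only inspect the ticket cache when klist is available on this host.
if command -v klist >/dev/null 2>&1; then
  klist 2>/dev/null | ticket_short_name || true
fi
```

For a cache holding hdfs-prod@ABC.NET this prints hdfs-prod, which is different from the OS user (hdfs) that the command runs as.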