HDFS Balancer - Access denied for user hdfs-prod. Superuser privilege is required

Expert Contributor

Hello experts,

While running the balancer utility as the hdfs user, I'm getting the error below.

HDFS Balancer - Access denied for user hdfs-prod. Superuser privilege is required

Note: I'm running as user hdfs against a Kerberized cluster. My principal name is hdfs-prod@ABC.NET.

dfs.permissions.superusergroup = hdfs

and

hdfs groups hdfs

shows

hdfs : hadoop hdfs
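The check described above (is the configured superuser group among the groups reported for the user?) can be sketched in a few lines of Python. This is only an illustration: the helper names `parse_groups_line` and `in_superuser_group` are hypothetical, and the input strings are the values quoted in this post, not something read from a live cluster.

```python
def parse_groups_line(line):
    """Split a line such as 'hdfs : hadoop hdfs' into (user, [groups])."""
    user, _, groups = line.partition(":")
    return user.strip(), groups.split()


def in_superuser_group(groups_line, superusergroup):
    """Return True if the superuser group appears in the user's group list."""
    _, groups = parse_groups_line(groups_line)
    return superusergroup in groups


# Values taken from the post: output of `hdfs groups hdfs` and
# the configured dfs.permissions.superusergroup.
print(in_superuser_group("hdfs : hadoop hdfs", "hdfs"))
```

With the values from this post the check passes, which suggests the group mapping itself is probably not where the error comes from.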

I'm sure I'm missing something here.

Thanks

Mayank

1 ACCEPTED SOLUTION

Expert Contributor

Found it.

I could have left it at that, but since this is production, sometimes you just need to know.

For replication, Falcon needs to be aware of the local and remote cluster IDs via the property below.

dfs.nameservices

Mine has a value like "prodID, devID".

The balancer tries to reach both nameservices. When I run the command on the prod cluster as hdfs with a proper ticket, it throws the error for "hdfs-prod" (my principal without the realm), yet it still balances the prod cluster. So the error, although not clear, is actually a permission denial on the remote nameservice. That makes sense: the user is still "hdfs", but the principal on the remote side is a different one, "hdfs-dev" in my case.

I ran the same command in dev and the cluster was rebalanced, but I got the same error, this time:

Access denied for user hdfs-dev. Superuser privilege is required.
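A quick way to spot this situation is to look at how many nameservices the client configuration lists. The Python sketch below parses an hdfs-site.xml fragment and returns the entries of dfs.nameservices. The XML string is a made-up stand-in for the real file (only the property name and the "prodID, devID" value come from this post), and the function name `nameservices` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for hdfs-site.xml; only the property name and
# its value are taken from the post.
HDFS_SITE = """<configuration>
  <property>
    <name>dfs.nameservices</name>
    <value>prodID, devID</value>
  </property>
</configuration>"""


def nameservices(hdfs_site_xml):
    """Return the list of nameservice IDs found in dfs.nameservices."""
    root = ET.fromstring(hdfs_site_xml)
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() == "dfs.nameservices":
            value = prop.findtext("value", "")
            return [ns.strip() for ns in value.split(",") if ns.strip()]
    return []


services = nameservices(HDFS_SITE)
if len(services) > 1:
    # Per the explanation above, the balancer may try each of these, and
    # the local principal is unlikely to be a superuser on the remote one.
    print("more than one nameservice configured:", services)
```

If more than one ID comes back, an "Access denied" message naming the local principal may well be coming from the remote nameservice rather than the local one.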

Thanks for the support @emaxwell, @mqureshi, @Kuldeep Kulkarni.

I hope the above answer helps others. (A few other hdfs commands likewise have no effect or return errors.)

Thanks

Mayank

8 REPLIES

Master Guru

@mkataria

Please check your core-site.xml to see whether you have added an auth_to_local rule.

Generally it should have the rule below set up:

RULE:[1:$1@$0](hdfs-prod@ABC.NET)s/.*/hdfs/

Please check, and add it if it's not there.
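To see what a rule of this shape does to a principal, here is a rough Python imitation of the matching steps. It is a simplified sketch and not Hadoop's implementation: the function `apply_rule` and its parameters are hypothetical, and it only covers the pieces used by the rule above (component count, the `$1@$0` format, the match pattern in parentheses, and a single sed-style substitution).

```python
import re


def apply_rule(principal, num_components, fmt, match_regex, sed_pattern, sed_replacement):
    """Return the short name if the rule applies to the principal, else None."""
    name, _, realm = principal.partition("@")
    components = name.split("/")
    if len(components) != num_components:
        return None
    # Build the string the rule is matched against:
    # $0 is the realm, $1..$n are the principal's components.
    formatted = fmt.replace("$0", realm)
    for i, comp in enumerate(components, start=1):
        formatted = formatted.replace(f"${i}", comp)
    if not re.fullmatch(match_regex, formatted):
        return None
    # Apply the substitution once, like s/pattern/replacement/ without "g".
    return re.sub(sed_pattern, sed_replacement, formatted, count=1)


# RULE:[1:$1@$0](hdfs-prod@ABC.NET)s/.*/hdfs/
print(apply_rule("hdfs-prod@ABC.NET", 1, "$1@$0", r"hdfs-prod@ABC.NET", r".*", "hdfs"))
```

On a real cluster, running `hadoop org.apache.hadoop.security.HadoopKerberosName hdfs-prod@ABC.NET` should show the short name the configured rules actually produce, which is the more reliable check.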

Expert Contributor

Thanks for the pointer, @Kuldeep Kulkarni, but the rules are set properly. I'm able to do pretty much everything with the keytab except run the balancer command.

Thanks

Mayank

@mkataria

In order to do superuser commands (like enter safe mode, balance cluster, etc.), you have to run the command as the user that started the NameNode process. If the NameNode is running as the hdfs user, then you will need to issue these commands as the hdfs user:

sudo -u hdfs hdfs balancer -threshold 5

Expert Contributor

Thanks @emaxwell,

I'm running those commands as the hdfs user, and 'hdfs dfsadmin -safemode' works fine.

I can pretty much use the balancer from Ambari, but I'm super curious to know what went wrong here.

Thanks

Mayank

Super Guru

@mkataria your error says you are running it as "hdfs-prod". You might have made this user a member of supergroup but that still doesn't quite make him the superuser. There is only one superuser and that's the guy who started the namenode like @emaxwell pointed out (likely username "hdfs").

Think about root user in linux. You can create other users and make them member of root group but still there is only one root user. That user in Hadoop is hdfs and no other user. More details here.

Expert Contributor

Thanks @mqureshi,

As stated above, I'm running the command as "hdfs"; "hdfs-prod" is my principal name without the realm.

This user is part of the superuser group and all the mappings look right. I'm sure it's something that escapes the eye.

Regards

Mayank

Super Guru

@mkataria so I am assuming here is the sequence of your commands.

1. Assuming you are root, you do "su hdfs". So now you are hdfs.

2. kinit -k -t <path to your keytab> hdfs-prod@ABC.NET

3. now you run "hdfs balancer -threshold <your threshold>"

If my assumptions are right, then my question is whether your principal "hdfs-prod@ABC.NET" has authority to impersonate in core-site.xml, something like below:

<property>
  <name>hadoop.proxyuser.hdfs-prod.hosts</name>
  <value>host1,host2</value>
</property>
<property>
  <name>hadoop.proxyuser.hdfs-prod.groups</name>
  <value>group1,group2,supergroup,hdfs</value>
</property>
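To check whether such entries are present for a given user, a small Python sketch like the one below could scan core-site.xml. The XML string simply repeats the example properties above wrapped in a `<configuration>` element, and the function name `proxyuser_settings` is hypothetical; on a real cluster you would read the actual file instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for core-site.xml, using the example values above.
CORE_SITE = """<configuration>
  <property>
    <name>hadoop.proxyuser.hdfs-prod.hosts</name>
    <value>host1,host2</value>
  </property>
  <property>
    <name>hadoop.proxyuser.hdfs-prod.groups</name>
    <value>group1,group2,supergroup,hdfs</value>
  </property>
</configuration>"""


def proxyuser_settings(core_site_xml, user):
    """Collect hadoop.proxyuser.<user>.* entries as {suffix: [values]}."""
    prefix = f"hadoop.proxyuser.{user}."
    settings = {}
    for prop in ET.fromstring(core_site_xml).iter("property"):
        name = prop.findtext("name", "").strip()
        if name.startswith(prefix):
            values = prop.findtext("value", "").split(",")
            settings[name[len(prefix):]] = [v.strip() for v in values if v.strip()]
    return settings


print(proxyuser_settings(CORE_SITE, "hdfs-prod"))
```

An empty result for the user in question would mean no proxyuser entries are configured for it.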
