Where does the super user group need to be created?

Contributor

We have the superuser group defined as 'supergroup' in our configuration. However, this group does not exist on any of the nodes.

If I have to set up this group and start adding a couple of other accounts so that they have superuser access to HDFS, where should this Linux group be created? Should it be created on all nodes in the cluster? Or is it sufficient to create the Linux group on the NameNode hosts only?

1 ACCEPTED SOLUTION

Mentor
> We have the superuser group defined as 'supergroup' in our configuration. However, this group does not exist on any of the nodes.

This is intentional. The default is a name (supergroup) that normally does not exist on a fresh system, which protects against unintended superusers right after install. You are free to change the group name via the HDFS -> Configuration -> "Superuser Group" field (the dfs.permissions.superusergroup property).
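
To see whether the configured group name actually resolves on a given host, something like the following could be run there. It is only a minimal sketch using Python's standard `grp` module; the function name `describe_group` is made up for illustration, and `supergroup` is simply the default name from the question.

```python
import grp


def describe_group(name):
    """Return (exists, members) for a group as this host resolves it."""
    try:
        entry = grp.getgrnam(name)
    except KeyError:
        # The group is not known to this host.
        return False, []
    # gr_mem lists supplementary members only; users whose primary
    # group is this one do not appear here.
    return True, sorted(entry.gr_mem)


if __name__ == "__main__":
    exists, members = describe_group("supergroup")
    print("supergroup exists:", exists, "members:", members)
```

On a fresh install this would be expected to report that the group does not exist, which matches the behaviour described above.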

> If I have to set up this group and start adding a couple of other accounts so that they have superuser access to HDFS, where should this Linux group be created? Should it be created on all nodes in the cluster? Or is it sufficient to create the Linux group on the NameNode hosts only?

When you use no centralized user/group management software (such as AD via LDAP), the general and bulletproof approach to adding local Linux groups and usernames in a cluster is always "all hosts". The reason is that host role assignments are not static over the life of the cluster: adding the group on the NameNode host(s) alone will work immediately, but you will face confusing authorization issues later when a NameNode needs to be migrated to, or replaced with, another host. Likewise, if security is turned on in the future, local accounts will be required on the worker hosts.
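
As a rough illustration of the "all hosts" approach, the sketch below builds (but does not run) one ssh command per host that creates the group and adds users to it. The hostnames and the username are placeholders, `provisioning_commands` is a hypothetical helper, and it assumes the user accounts already exist on each host and that you have ssh and sudo access there.

```python
import shlex


def provisioning_commands(hosts, group, users):
    """Build one ssh command per host that creates `group` and adds `users`."""
    commands = []
    for host in hosts:
        # groupadd -f exits successfully if the group already exists.
        remote = ["groupadd -f " + shlex.quote(group)]
        for user in users:
            # usermod -aG appends the group without dropping existing ones.
            remote.append(
                "usermod -aG {} {}".format(shlex.quote(group), shlex.quote(user))
            )
        script = " && ".join(remote)
        commands.append(["ssh", host, "sudo sh -c " + shlex.quote(script)])
    return commands


if __name__ == "__main__":
    # Placeholder hosts and user; replace with every host in the cluster.
    hosts = ["nn1.example.com", "nn2.example.com", "dn1.example.com"]
    for cmd in provisioning_commands(hosts, "supergroup", ["alice"]):
        print(" ".join(shlex.quote(part) for part in cmd))
```

Printing the commands first makes it easy to review them before running anything; a configuration management tool would be the more usual way to do this at scale.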
