Maybe a dumb question. For a Unix user on a DataNode to perform HDFS operations, does the same user also need to be created on the NameNode?
According to the documentation quoted below, the answer seems to be yes: the user and group have to exist on the NameNode because the mapping is "performed on the NameNode".
However, in my testing, a user on the DataNode can run hdfs commands even though that user does not exist on the NameNode.
Did I miss something here?
" For HDFS, the mapping of users to groups is performed on the NameNode. Thus, the host system configuration of the NameNode determines the group mappings for the users."
- On the DataNode, create a Unix user (aniu) and add it to the hadoop group. This user does not exist on the NameNode.
$ id
uid=1012(aniu) gid=1012(aniu) groups=1012(aniu),1001(hadoop)
$ hdfs dfs -mkdir /tmp/aniu
$ hdfs dfs -ls /tmp
drwxr-xr-x - aniu hdfs 0 2016-11-12 08:22 /tmp/aniu
Assuming that you issued the hdfs commands as user aniu, they work as expected because aniu is the owner of the directory.
If you issued the command as another user, it should still produce the listing you show, because the ‘other users’ have read permission on the directory.
The group mappings may not need to be consulted at all. The HDFS Permissions Guide that you reference says the following is checked:
Each client process that accesses HDFS has a two-part identity composed of the user name, and groups list. Whenever HDFS must do a permissions check for a file or directory foo accessed by a client process,
So, either aniu is the owner (no group mapping necessary), or aniu is not the owner and, because no group mapping exists for aniu on the NameNode, no group matches and the 'other' permissions are tested (read).
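To make that order concrete, here is a minimal sketch in plain shell of the owner, then group, then other selection as I understand it from the permissions guide. It is not HDFS code: which_class is a hypothetical helper name, and the user "bob" and the group lists are made-up inputs for illustration only.

```shell
#!/usr/bin/env bash
# Hypothetical helper, NOT part of HDFS: reports which permission class
# (owner, group, or other) would be tested for a given user.
# Usage: which_class USER "GROUP1 GROUP2 ..." FILE_OWNER FILE_GROUP
which_class() {
  local user="$1" groups="$2" owner="$3" fgroup="$4"
  local g
  # 1. The owner check comes first; the groups list is not consulted.
  if [ "$user" = "$owner" ]; then
    echo owner
    return
  fi
  # 2. Otherwise look for the file's group in the user's groups list.
  #    $groups is deliberately unquoted so it splits on whitespace.
  for g in $groups; do
    if [ "$g" = "$fgroup" ]; then
      echo group
      return
    fi
  done
  # 3. No match: the "other" permissions apply.
  echo other
}

# /tmp/aniu is owned by aniu:hdfs. An empty groups list stands in for
# a NameNode that cannot resolve any groups for the user.
which_class aniu "" aniu hdfs   # prints "owner"
which_class bob  "" aniu hdfs   # prints "other" (bob is a made-up user)
```

With drwxr-xr-x on /tmp/aniu, both cases still allow the listing: the owner class has rwx and the other class has r-x. To see which groups the NameNode actually resolves for the user, running `hdfs groups aniu` should show them.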