How to get the list of users
Labels: Apache Hadoop
Created ‎08-19-2016 03:13 PM
Recently I was given a Hadoop ecosystem to support. This system has no Ranger, LDAP, etc., and access was given directly to the boxes. Could you please suggest some ways to get the list of users who use our Hadoop system?
Created ‎08-19-2016 03:54 PM
First the easy part. Let's assume Kerberos is enabled. Run "listprincs" in "kadmin" to list the principals. With Kerberos enabled and no LDAP, these principals are the users who have access to your cluster.
If Kerberos is not enabled, then pretty much every user on your cluster machines can access your cluster.
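For the Kerberos case, here is a minimal sketch of separating user principals from service principals in the "listprincs" output. It assumes that service principals have the form primary/instance@REALM and that ordinary users have no "/" in their principal; that is a common convention, not a guarantee, so admin principals such as alice/admin would be dropped and headless service principals would be kept. The function name user_principals is made up for illustration.

```shell
# user_principals: reads "listprincs" output on stdin and keeps only
# principals of the form name@REALM (no "/" instance part, no spaces).
# Assumption: anything containing "/" is a service or admin principal.
user_principals() {
    grep -E '^[^/[:space:]]+@[^[:space:]]+$'
}
```

On the KDC host this could be used as: kadmin.local -q "listprincs" | user_principals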
Created ‎08-23-2016 03:53 PM
@mqureshi Kerberos is not enabled.
Is it enough if I pull the users from the name node and the admin node?
Created ‎08-23-2016 03:53 PM
Without Kerberos, pretty much anyone can access your cluster. The list of users who can access the cluster is everyone who has access to the Linux machines where the cluster is running.
Created ‎08-23-2016 04:15 PM
We can find the list of users in both Linux and Hadoop:
1) List of users in Linux:
awk -F':' '{ print $1}' /etc/passwd
2) List of users in Hadoop:
Go to Hue and look for the user admin tab (a tab like the Hive and Pig tabs).
3) Check the users in HDFS:
Ideally, each user also has a home directory in HDFS under /usr/<userID>/
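The awk one-liner in step 1 prints every account, including system accounts. A minimal sketch of narrowing that down to likely human logins is below. It assumes regular accounts start at UID 1000 (older RHEL/CentOS releases start at 500) and that accounts with a nologin or false shell cannot log in; the function name list_login_users and both thresholds are illustrative choices, not something the cluster defines.

```shell
# list_login_users [PASSWD_FILE] [MIN_UID]
# Prints account names with UID >= MIN_UID whose shell is not nologin/false.
# Assumptions: MIN_UID defaults to 1000; the file is in /etc/passwd format.
list_login_users() {
    awk -F':' -v min="${2:-1000}" '
        ($3 + 0) >= (min + 0) && $7 !~ /(nologin|false)$/ { print $1 }
    ' "${1:-/etc/passwd}"
}
```

Note that Hadoop service accounts such as hdfs or hive usually have low UIDs, so pass a lower MIN_UID (for example, list_login_users /etc/passwd 0) if you want them included.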
Created ‎08-23-2016 04:53 PM
+1, with a small correction: the HDFS directories will be under "/user", not "/usr":
hdfs dfs -ls /user
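To turn that listing into a list of names, a small filter can pull out each directory name and its owner. This is only a sketch: it assumes the usual eight-column "hdfs dfs -ls" layout (permissions, replication, owner, group, size, date, time, path) and paths without spaces. The function name hdfs_home_owners is made up for illustration.

```shell
# hdfs_home_owners: reads "hdfs dfs -ls /user" output on stdin and prints
# "<directory-name> <owner>" for every directory entry.
# The "Found N items" header and plain files are skipped.
hdfs_home_owners() {
    awk '$1 ~ /^d/ && NF >= 8 {
        n = split($NF, parts, "/")   # last path component is the directory name
        print parts[n], $3           # column 3 is the owner
    }'
}
```

Usage: hdfs dfs -ls /user | hdfs_home_owners
A directory whose owner differs from its name (for example, one owned by hdfs) may be worth a closer look.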
Created ‎08-23-2016 10:28 PM
Assuming that you are only interested in who has access to Hadoop services, extract all OS users from all nodes by checking the content of the /etc/passwd file. Some of them are legitimate service users needed by Hadoop tools, e.g. hive, hdfs, etc. Users of HDFS will have a /user/username folder in HDFS. You can see those folders with hadoop fs -ls /user, executed as a user who is a member of the hadoop group. Users who have access to the Hive client can also perform DDL and DML actions in Hive.
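As a rough sketch of the "all OS users from all nodes" step: assuming you have already saved each node's account list into one file per host, named <host>.passwd, in a single directory, the function below merges them into one line per user with the hosts where that account exists. The file naming scheme and the function name users_by_host are assumptions made for this example.

```shell
# users_by_host DUMP_DIR
# Reads DUMP_DIR/<host>.passwd files (in /etc/passwd format) and prints
# "<user><TAB><host1>,<host2>,..." sorted by user name.
users_by_host() {
    for f in "$1"/*.passwd; do
        [ -e "$f" ] || continue                  # no dumps found
        host=$(basename "$f" .passwd)
        awk -F':' -v host="$host" '{ print $1 "\t" host }' "$f"
    done | LC_ALL=C sort -u | awk -F'\t' '
        $1 != prev { if (NR > 1) print prev "\t" hosts; prev = $1; hosts = $2; next }
        { hosts = hosts "," $2 }                 # same user, another host
        END { if (NR > 0) print prev "\t" hosts }
    '
}
```

The dumps themselves could be collected with something like ssh "$h" getent passwd > "dumps/$h.passwd" for each host name $h in your own host list, followed by users_by_host dumps.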
The above will let you understand the current state. However, this is also your opportunity to improve security, even without the bells and whistles of Kerberos/LDAP/Ranger. You can force users to access the Hadoop ecosystem through a few client/edge nodes where only client services are running, e.g. the Hive client. Users other than power users should not have accounts on the name node, admin node or data nodes. Keep in mind that any user who can log in to a node where client services are running can access those services, e.g. HDFS or Hive.
