User management on each node

Hi, I have an HDP cluster running with AD authentication for Ranger and Zeppelin.

I noticed that for Hive to be accessible to an AD user or group that a Ranger ACL allows, that username and group must exist on the Hive server (e.g. useradd some-ad-user -G some-ad-group).

HDFS access behaves similarly. Without a user being set up on the NameNode, I can make the Ranger ACL stick by specifying the username, but not by specifying the group.

The necessity of this seems sensible enough. However, I'm uncertain about the proper way to manage user accounts on each Linux machine. Do I need to mirror every AD account/group on every cluster node, or on a subset of service nodes, or is there a third option which is correct? It seems to defeat the purpose of using Active Directory if I must manage users/groups across the entire cluster anyway. I thought perhaps Knox is the solution for this, which I'm in the middle of configuring, but I thought I'd ask the question in case the pursuit is fruitless.

Thank you.
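
The behaviour described above (group-based policies only taking effect when the account exists on the Hive server or NameNode) can be checked by asking a node which groups it resolves for the account. This is a minimal sketch, assuming Hadoop's default OS-based group mapping is in use so that groups are looked up through the local operating system. `some-ad-user` is the placeholder from the question and `resolve_groups` is a hypothetical helper, not part of HDP.

```shell
#!/bin/sh
# Hypothetical helper: print the groups this node resolves for a user,
# or "unresolved" if the operating system does not know the account.
resolve_groups() {
    id -Gn "$1" 2>/dev/null || echo "unresolved"
}

# Placeholder account name from the question; pass a real name as $1.
user="${1:-some-ad-user}"
echo "OS groups on $(uname -n): $(resolve_groups "$user")"

# If the HDFS client is installed, also ask the NameNode which groups it sees.
if command -v hdfs >/dev/null 2>&1; then
    hdfs groups "$user"
fi
```

Running this on the NameNode and on a client host shows whether the two disagree about the user's groups, which would be consistent with user-based policies working while group-based ones do not.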

1 ACCEPTED SOLUTION

Master Mentor

@Lindsay Gaff

The best practice is to confine all users to the edge node. Make sure all the clients (e.g. oozie, hive, hdfs, zookeeper) are installed on this host; Ambari automatically updates these client configs with the correct files during installation.

Once that is done, your users should be able to execute any task on the cluster from the edge node. As you remarked, it's impractical to have users on all the nodes. YES, that defeats the purpose of having centralized control.
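
A quick way to confirm that an edge node has the clients mentioned above is to check whether their commands are on the PATH. This is only a sketch: `missing_clients` is a hypothetical helper, and the command names `hdfs`, `hive` and `oozie` are assumptions about how the clients are exposed on your host (the ZooKeeper client command in particular varies by distribution, so it is left out here).

```shell
#!/bin/sh
# Hypothetical helper: print each named command that is NOT on the PATH.
missing_clients() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
}

# Assumed client command names; adjust to the clients installed on your edge node.
missing=$(missing_clients hdfs hive oozie)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "Clients missing on this host:" $missing
else
    echo "All listed clients are available."
fi
```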

4 REPLIES

@Geoffrey Shelton Okot

Thanks. What is the recommended method for syncing users and groups to the edge node? Can I use PAM/LDAP on these nodes to keep it all tied together, or do I still need to manage user accounts manually on the command line?

Master Mentor

@Lindsay Gaff

Ranger usually does that for you once you have configured LDAP authentication 🙂. It runs a user sync process periodically; manual maintenance is just not workable. See: Ranger LDAP integration

HTH
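
As far as I understand it, Ranger's user sync makes AD users and groups available inside Ranger for writing policies, while whether the Linux host itself can resolve an AD account depends on the host's name service configuration (for example SSSD or an LDAP NSS module, which is what the PAM/LDAP question is about). The sketch below shows one way to tell the two cases apart on a node. `account_source` is a hypothetical helper and `some-ad-user` is the placeholder name used earlier in the thread.

```shell
#!/bin/sh
# Hypothetical helper: report where this node gets an account from.
#   local      - defined in /etc/passwd
#   directory  - resolved through NSS (e.g. SSSD/LDAP) but not in /etc/passwd
#   unresolved - the node cannot resolve the account at all
account_source() {
    if ! getent passwd "$1" >/dev/null 2>&1; then
        echo "unresolved"
    elif awk -F: -v u="$1" '$1 == u { found = 1 } END { exit !found }' /etc/passwd 2>/dev/null; then
        echo "local"
    else
        echo "directory"
    fi
}

# Placeholder account name; pass a real AD account as $1.
user="${1:-some-ad-user}"
echo "$user: $(account_source "$user")"
```

If an AD account reports `directory` on the edge node, the host is already resolving it centrally and no useradd is needed there; `unresolved` suggests the host is not yet tied to the directory.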

Master Mentor

@Lindsay Gaff

If you found this answer addressed your question, please take a moment to log in and click the "accept" link on the answer. That would be a great help to Community users to find the solution quickly for these kinds of errors.