
Sentry + Kerberos + Impala : manage users


Hi,

We have tried to configure Sentry in our Hadoop cluster. In the HUE interface, logged in as the “hdfs” user, queries on Hive and Impala work well.

We have the following configuration:
- master1: namenode + kerberos server + Sentry + Impala Catalog Server + Hue Server + Hive Metastore/Server2
- master2: namenode (standby) + Kerberos secondary server + Impala StateStore
- worker1: datanode + Impala Daemon
- worker2: datanode + Impala Daemon

After this, we wanted to test with a specific user, so we did the following steps:

On the master server:

#We create the user on OS (Ubuntu 14.04)
master1$ adduser dev1
master1$ addgroup sentry_dev
master1$ usermod -a -G sentry_dev dev1
#We create the user on Kerberos
master1$ sudo kadmin.local
kadmin.local: addprinc dev1
kadmin.local: exit

 

On HUE:
- Create user dev1
- Create group sentry_dev
- Put user dev1 in group sentry_dev

 

On HUE query editor:

CREATE ROLE dev_rol;
GRANT ALL ON DATABASE default TO ROLE dev_rol;
GRANT ROLE dev_rol TO GROUP sentry_dev;


After this step, dev1 can access the default database in HUE, but only through Hive. With Impala, we get the message "User 'dev1' does not have privileges to access: default.*".

 

After some research, we found that we need to have the user on each node. So we did this step for each node (master2, worker1, worker2) on the OS:

$ adduser dev1
$ addgroup sentry_dev
$ usermod -a -G sentry_dev dev1
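
Since the fix here was adding the OS user and group on every node, a quick per-node check can confirm whether a host resolves the user and its group membership. The following is only a minimal sketch: the function names `has_group` and `check_user` are made up for illustration and are not part of any Hadoop or Sentry tooling.

```shell
#!/bin/sh
# Return 0 if group $2 appears in the space-separated group list $1.
has_group() {
    for g in $1; do
        [ "$g" = "$2" ] && return 0
    done
    return 1
}

# Report whether user $1 resolves on this host and belongs to group $2.
check_user() {
    user_groups=$(id -Gn "$1" 2>/dev/null) || {
        echo "$1: no such user on $(hostname)"
        return 1
    }
    if has_group "$user_groups" "$2"; then
        echo "$1: member of $2"
    else
        echo "$1: not in $2"
        return 1
    fi
}
```

Running `check_user dev1 sentry_dev` on each of master2, worker1 and worker2 would show which nodes still lack the user or the group.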

 


Now we have access to Impala tables, but this means we have to create each new user manually on every node. The more users and nodes we have (say 20 users and 10 nodes), the harder this becomes to manage.

 

Do you have a better solution? Is there something wrong in our configuration?

 

Thank you,
Martin.

1 ACCEPTED SOLUTION


@martinbo,

 

As mentioned by others, there are some options to ease the management of users and groups. 

 

Common ones are:

 

1 - OS-level integration with SSSD, IPA, or Centrify, so that user and group lookups made against the OS are answered from a central LDAP source. This requires a good deal of configuration, but it is a robust, enterprise-grade solution.

 

2 - Manage your group and passwd files with automation tools like Puppet, Chef, etc. (modify once, then push the changes out to all hosts).

 

3 - Configure LdapGroupsMapping in HDFS so that Hadoop services do group lookups directly against LDAP.

NOTE: If you intend to let users run jobs directly on YARN, you will still need to create local users on each host that runs a NodeManager, since containers require the OS user to be present.
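
For option 3, the group mapping is driven by a handful of `core-site.xml` properties (on a Cloudera Manager cluster these are typically set through the HDFS configuration pages rather than by hand). The sketch below just prints such a fragment so the property names are visible. The helper names `prop` and `ldap_group_mapping_fragment` are illustrative, and the LDAP URL, bind user and search base passed in are placeholders you would replace with your own values. A real setup also needs bind credentials and usually search filters, which are omitted here.

```shell
#!/bin/sh
# Print one Hadoop <property> element with name $1 and value $2.
prop() {
    printf '  <property>\n    <name>%s</name>\n    <value>%s</value>\n  </property>\n' "$1" "$2"
}

# Print a core-site.xml fragment that switches group lookups to LDAP.
# $1 = LDAP URL, $2 = bind user DN, $3 = search base (all site-specific).
ldap_group_mapping_fragment() {
    prop hadoop.security.group.mapping org.apache.hadoop.security.LdapGroupsMapping
    prop hadoop.security.group.mapping.ldap.url "$1"
    prop hadoop.security.group.mapping.ldap.bind.user "$2"
    prop hadoop.security.group.mapping.ldap.base "$3"
}
```

For example, `ldap_group_mapping_fragment ldap://ldap.example.com 'cn=hadoop,dc=example,dc=com' 'dc=example,dc=com'` prints four properties. After applying the change and restarting the affected services, `hdfs groups dev1` should show the groups Hadoop resolves for the user.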

 

 


3 REPLIES

You can integrate LDAP and use LDAP to create and manage users across nodes.

References: https://www.cloudera.com/documentation/enterprise/5-10-x/topics/cm_sg_external_auth.html

https://www.youtube.com/watch?v=Tpx0uNXJh7U


@martinbo

 

For creating multiple users across multiple nodes, you will need configuration management tools such as Puppet, Chef, or Ansible.

 

You asked only about creating a new user on each node, but in practice your requirements will grow to include the following:


1. Create/modify the user on each node
2. Set up a temporary password if you don't have SSO
3. Create/modify multiple user groups on each node (admin group, developer group, tester group, analyst, etc.)
4. Assign each user to the corresponding user groups
5. Create a home directory for each user, and set up a quota if needed
6. Set the permissions and owner of each home directory (other users should not have access)
7. etc
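
Several of the steps above (user, group, membership, home directory ownership and permissions) can be sketched as a plain shell loop over ssh. This is only an illustration of what a configuration management tool would do for you, not a replacement for one. The function names are hypothetical, and the sketch assumes root ssh access to each host and home directories under `/home`. It defaults to a dry run that only prints the commands.

```shell
#!/bin/sh
# Emit the commands that create group $2 and user $1 on one host.
provision_cmds() {
    user=$1
    group=$2
    echo "getent group $group >/dev/null || groupadd $group"
    echo "id -u $user >/dev/null 2>&1 || useradd -m -s /bin/bash $user"
    echo "usermod -a -G $group $user"
    echo "chown $user /home/$user && chmod 700 /home/$user"
}

# Apply the commands to every host given after the user and group.
# With DRY_RUN=1 (the default) the commands are printed, not executed.
provision_hosts() {
    user=$1
    group=$2
    shift 2
    for host in "$@"; do
        if [ "${DRY_RUN:-1}" = 1 ]; then
            provision_cmds "$user" "$group" | sed "s/^/[$host] /"
        else
            # Assumes passwordless root ssh; the script is read from stdin.
            provision_cmds "$user" "$group" | ssh "root@$host" sh -s
        fi
    done
}
```

For example, `provision_hosts dev1 sentry_dev master2 worker1 worker2` prints the plan for the three remaining nodes, and `DRY_RUN=0 provision_hosts ...` would run it. Passwords and quotas are left out on purpose.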

 

There are many other things these tools can do, but I've listed a few based on your requirements. Hope it helps.

 

 
