07-27-2018 06:05 AM
Hi,

We have Kerberos configured in our Hadoop cluster. We did a wizard installation (https://www.cloudera.com/documentation/enterprise/5-6-x/topics/cm_sg_intro_kerb.html) and it works well.

To get a higher level of availability, we configured a secondary KDC server (we followed the Kerberos documentation) and set up replication of the credentials from the first Kerberos server to the second (as in this topic: https://community.hortonworks.com/articles/92333/configure-two-kerberos-kdcs-as-a-masterslave.html). We then updated the Kerberos configuration in Cloudera Manager to add the secondary KDC server. The configuration generated by Cloudera in /etc/krb5.conf contains:

[realms]
XXXXXX.COM = {
kdc = master1.com
admin_server = master1.com
kdc = worker1.com
}
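As far as we understand, clients walk through the kdc entries in the order listed, and the JVM Kerberos client that Hadoop uses decides how quickly to give up on a dead KDC based on kdc_timeout and max_retries in [libdefaults] (MIT kinit may ignore these). This is a minimal sketch of what we are experimenting with on top of the generated file; the values are our own guesses, not Cloudera recommendations:

[libdefaults]
default_realm = XXXXXX.COM
#wait at most 3 seconds (value in milliseconds) for a reply from a KDC
kdc_timeout = 3000
#try each KDC at most twice before moving on to the next entry
max_retries = 2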
We have the following configuration:
- master1: Kerberos server + HDFS NameNode (active)
- worker1: Kerberos server + HDFS NameNode (standby)
- worker2: Kerberos client + HDFS DataNode

We are testing the replication of Kerberos.

Case 1: stop the Kerberos server (kdc + kadmin) on master1 and init a user ticket on worker2 with kinit. This works well.

Case 2: stop the Kerberos server (kdc + kadmin) and the HDFS NameNode on master1, to simulate a crash of the server master1. Normally, the NameNode on worker1 should become active, but instead there is an error on worker1: "This role's process exited. This role is supposed to be started." The log shows:

PriviledgedActionException as:hdfs/worker1.com@XXXXXX.COM (auth:KERBEROS) cause:java.io.IOException: org.apache.hadoop.security.authentication.client.AuthenticationException: GSSException: No valid credentials provided (Mechanism level: Connection refused (Connection refused))

Conclusion/question: my conclusion is that the NameNode on worker1 does not use the secondary KDC (nothing appears in kadmin.log on worker1). A manual kinit works, so it is not a Kerberos replication problem. In other words, if the server hosting the main KDC crashes, the Hadoop services crash too. This is a big problem. Do you have a solution, or any suggestion?

Thank you,
Martin.
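P.S. For what it's worth, this is how we confirmed that kinit itself fails over in Case 1 (assuming MIT krb5 1.9 or later, which honours the KRB5_TRACE environment variable; the principal name is just an example):

#with the KDC on master1 stopped, the trace should show the attempt
#against master1.com failing and a retry against worker1.com
worker2$ KRB5_TRACE=/dev/stderr kinit someuser@XXXXXX.COM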
07-23-2018 12:48 AM
Hi,

We have tried to configure Sentry in our Hadoop cluster. On the HUE interface with the "hdfs" user, queries on Hive and Impala work well.

We have the following configuration:
- master1: namenode + Kerberos server + Sentry + Impala Catalog Server + Hue Server + Hive Metastore/Server2
- master2: namenode (standby) + Kerberos secondary server + Impala StateStore
- worker1: datanode + Impala Daemon
- worker2: datanode + Impala Daemon

After this, we wanted to try with a specific user, so we did the following steps.

On the master server:

#We create the user on the OS (Ubuntu 14.04)
master1$ adduser dev1
master1$ addgroup sentry_dev
master1$ usermod -a -G sentry_dev dev1
#We create the user on Kerberos
master1$ sudo kadmin.local
kadmin.local: addprinc dev1
kadmin.local: exit
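To make sure the new principal works before touching HUE, a quick sanity check (the password is the one chosen at addprinc time):

master1$ kinit dev1
master1$ klist
#klist should now show a ticket for dev1@XXXXXX.COM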
On HUE:
- Create user dev1
- Create group sentry_dev
- Put user dev1 in group sentry_dev

On the HUE query editor:

CREATE ROLE dev_rol;
GRANT ALL ON DATABASE default TO ROLE dev_rol;
GRANT ROLE dev_rol TO GROUP sentry_dev;
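To verify the wiring from the query editor, Sentry's SHOW statements can be used (a quick check run as a user with admin rights, not a required step):

SHOW ROLE GRANT GROUP sentry_dev; -- should list dev_rol
SHOW GRANT ROLE dev_rol;          -- should list ALL on the default database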
After this step, dev1 has access to the default database in HUE, but only with Hive. With Impala, there is the message "User 'dev1' does not have privileges to access: default.*".

After some research, we found that the user needs to exist on each node, so we repeated the OS step on each remaining node (master2, worker1, worker2):

$ adduser dev1
$ addgroup sentry_dev
$ usermod -a -G sentry_dev dev1

Now we have access to the Impala tables, but it means we have to create every new user on every node manually. The more users we have, the more complicated this becomes to manage (for example, 20 users on 10 nodes). Do you have a better solution? Is there something wrong in our configuration?

Thank you,
Martin.
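P.S. In the meantime we script the per-node account creation; a rough sketch (it assumes passwordless SSH with sudo rights from master1 to the other nodes, which may not match your setup):

#create the OS group and user on every other node from master1
for host in master2 worker1 worker2; do
  ssh "$host" "sudo addgroup sentry_dev; \
    sudo adduser --disabled-password --gecos '' dev1; \
    sudo usermod -a -G sentry_dev dev1"
done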