Member since 07-06-2018, 5 Posts, 1 Kudos Received, 0 Solutions
07-27-2018 06:05 AM
Hi,

We have Kerberos configured in our Hadoop cluster. We did a wizard installation (https://www.cloudera.com/documentation/enterprise/5-6-x/topics/cm_sg_intro_kerb.html), and it works well.

To get a high level of availability, we configured a secondary KDC server, following the Kerberos documentation. The credentials are replicated from the first Kerberos server to the second (as in this article: https://community.hortonworks.com/articles/92333/configure-two-kerberos-kdcs-as-a-masterslave.html). We then updated the Kerberos configuration in Cloudera Manager to add the secondary KDC server.

The configuration generated by Cloudera in /etc/krb5.conf contains:

[realms]
XXXXXX.COM = {
kdc = master1.com
admin_server = master1.com
kdc = worker1.com
}

We have the following configuration:

master1: Kerberos server + NameNode (active) HDFS
worker1: Kerberos server + NameNode HDFS
worker2: Kerberos client + DataNode HDFS

We are testing the Kerberos replication.

Case 1: stop the Kerberos server (kdc + kadmin) on master1 and init a user ticket on worker2 with kinit.
It works well.

Case 2: stop the Kerberos server (kdc + kadmin) and the HDFS NameNode on master1 (to simulate a crash of the master1 server).
Normally, the NameNode on worker1 should become active. Instead, there is an error on worker1: "This role's process exited. This role is supposed to be started."

Message in log:

PriviledgedActionException as:hdfs/worker1.com@XXXXXX.COM (auth:KERBEROS) cause:java.io.IOException: org.apache.hadoop.security.authentication.client.AuthenticationException: GSSException: No valid credentials provided (Mechanism level: Connection refused (Connection refused))

Conclusion/Question

My conclusion is that the NameNode on worker1 doesn't use the secondary KDC (there is nothing in kadmin.log on worker1). But if I run kinit manually, it works, so this is not a problem with Kerberos itself. If the server hosting the main Kerberos KDC crashes, the Hadoop services crash too. This is a big problem.

Do you have a solution? Or any suggestion?

Thank you,
Martin.
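One way to narrow down the "Connection refused" in Case 2 is to check, from the host where the role fails, which of the KDCs listed in /etc/krb5.conf actually accept connections. The following is only a rough sketch: `parse_kdcs` and `kdc_reachable` are hypothetical helper names, the parser assumes a simple realm block with no nested braces and no port suffix on the `kdc` entries, and the probe uses TCP on port 88 only, so a KDC that listens on UDP alone would be reported as unreachable.

```python
import re
import socket


def parse_kdcs(krb5_text, realm):
    """Return the kdc hosts listed for `realm`, in the order they appear."""
    # Locate the "REALM = { ... }" block (assumes no nested braces inside it).
    block = re.search(
        r"^\s*" + re.escape(realm) + r"\s*=\s*\{(.*?)\}",
        krb5_text,
        re.S | re.M,
    )
    if block is None:
        return []
    # Collect every "kdc = host" line; admin_server lines are ignored.
    return re.findall(r"^\s*kdc\s*=\s*(\S+)", block.group(1), re.M)


def kdc_reachable(host, port=88, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds (TCP probe only)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

As a usage example, with master1 stopped, one could run `[(h, kdc_reachable(h)) for h in parse_kdcs(open("/etc/krb5.conf").read(), "XXXXXX.COM")]` on worker1 and compare the result with the same call on worker2.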
07-24-2018 11:11 AM, 1 Kudo
@martinbo,

As mentioned by others, there are some options to ease the management of users and groups. Common ones are:

1 - SSSD, IPA, or Centrify OS-level integration, so that application calls to the OS are handled by those tools, which query a central LDAP source. This requires a good deal of configuration, but it is a robust, enterprise-grade solution.
2 - Manage your group and passwd files with automation tools like Puppet, Chef, etc. (modify once, "push out" changes to all hosts).
3 - Configure LdapGroupsMapping in HDFS so that Hadoop services do group lookups directly against LDAP.

NOTE: If you intend to let users run jobs directly on YARN, you will still need to create local users on each host with a NodeManager, since containers require the OS user to be present.
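To verify the point in the NOTE on a given NodeManager host, a quick check along these lines may help. It is a minimal sketch, assuming a Unix host with Python available; `missing_os_users` is a hypothetical helper name, and the usernames passed to it are whatever accounts you expect to submit jobs.

```python
import pwd


def missing_os_users(usernames):
    """Return the usernames that the OS cannot resolve on this host."""
    missing = []
    for name in usernames:
        try:
            # getpwnam goes through the system's name service lookup, so it
            # sees /etc/passwd entries as well as users provided by SSSD etc.
            pwd.getpwnam(name)
        except KeyError:
            missing.append(name)
    return missing
```

Running `missing_os_users([...])` on each NodeManager host with the list of job-submitting users gives the accounts that still need to be created or made resolvable there.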