Kerberos slave - high availability

New Contributor

Hi,

We have Kerberos configured in our Hadoop cluster.
We did a wizard installation (https://www.cloudera.com/documentation/enterprise/5-14-x/topics/cm_sg_intro_kerb.html), and it works well.

To get high availability, we configured a secondary kdc server following the Kerberos documentation.
Credentials are replicated from the first Kerberos server to the second (as in this article: https://community.hortonworks.com/articles/92333/configure-two-kerberos-kdcs-as-a-masterslave.html).
We updated the Kerberos configuration in Cloudera Manager (v5.14) to add the secondary kdc server. The configuration generated by Cloudera in /etc/krb5.conf contains:

[realms]
XXXXXX.COM = {
kdc = master1.com
admin_server = master1.com
kdc = worker1.com
}


We have the following configuration:
master1 : Kerberos server + Namenode (active) HDFS
worker1 : Kerberos server + Namenode HDFS
worker2 : Kerberos client + Datanode HDFS

 


We are testing the replication of Kerberos.

Case 1 : stop Kerberos server (kdc + kadmin) on master1 and init user ticket on worker2 with kinit

It works well.

Case 2 : stop Kerberos server (kdc + kadmin) and Namenode HDFS on master1 (to simulate the crash of the server master1)

Normally, the Namenode on worker1 should become active. Instead, worker1 reports the error: "This role's process exited. This role is supposed to be started."
Message in log:

PriviledgedActionException as:hdfs/worker1.com@XXXXXX.COM (auth:KERBEROS) cause:java.io.IOException: org.apache.hadoop.security.authentication.client.AuthenticationException: GSSException: No valid credentials provided (Mechanism level: Connection refused (Connection refused))

 

Conclusion/Question

So my conclusion is that the Namenode on worker1 doesn't use the secondary kdc (there is nothing in the kadmin.log on worker1).
But if I run kinit manually, it works, so it does not seem to be a problem with Kerberos itself.

If the server hosting the main Kerberos kdc crashes, the Hadoop services crash too. This is a big problem.
Do you have a solution or any suggestions?

 

PS: I already asked about this in another topic (http://community.cloudera.com/t5/Cloudera-Manager-Installation/kerberos-High-Availability/m-p/77651#...), but it seemed better to create a new post.


Thank you,
Martin.


Re: Kerberos slave - high availability

Master Collaborator

Hi,

Try changing the kdc entry in /etc/krb5.conf to:

kdc = master1.com worker1.com

Also, are you using HDFS in HA mode with JournalNodes? What is in the JournalNode logs?

Re: Kerberos slave - high availability

Super Guru

@martinbo,

 

Actually, the syntax suggested is not correct for recent releases of MIT Kerberos and will likely cause worse problems.

 

The syntax of your [realms] section is correct in using a separate kdc= for each kdc.

 

Please post the full /etc/krb5.conf file from your worker1.com host.

 

By default Java will use the following basic algorithm:

 

- Try the first kdc for the realm (master1.com in your case)

- Wait up to 30 seconds for a response

- Try the next kdc listed (in order, which is worker1.com)

- Wait up to 30 seconds for a response

- If no response... fail.
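
The steps above can be sketched roughly as follows. This is only a model of the simplified order described in this reply, not Java's actual implementation; the names first_responding_kdc, try_kdc and worst_case_wait are hypothetical, and the sketch assumes a dead kdc uses up the full timeout before the next one is tried.

```python
from typing import Callable, Collection, Optional, Sequence


def first_responding_kdc(
    kdcs: Sequence[str],
    try_kdc: Callable[[str, float], bool],
    timeout_s: float = 30.0,
) -> Optional[str]:
    """Try each kdc in the order listed and return the first that answers."""
    for host in kdcs:
        # try_kdc is a caller-supplied probe; it may block up to timeout_s.
        if try_kdc(host, timeout_s):
            return host
    # No kdc answered: the caller sees an authentication failure.
    return None


def worst_case_wait(
    kdcs: Sequence[str],
    down: Collection[str],
    timeout_s: float = 30.0,
) -> float:
    """Seconds spent waiting on dead kdcs before a live one is reached.

    Assumption: every kdc in `down` consumes the full timeout.
    """
    waited = 0.0
    for host in kdcs:
        if host not in down:
            return waited
        waited += timeout_s
    return waited


if __name__ == "__main__":
    kdcs = ["master1.com", "worker1.com"]
    # With master1.com down, the default wait is far longer than with 3 s.
    print(worst_case_wait(kdcs, {"master1.com"}))
    print(worst_case_wait(kdcs, {"master1.com"}, timeout_s=3.0))
```

With master1.com listed first and down, any service that gives up before the first wait expires never reaches worker1.com, which would match the behaviour in Case 2.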

 

Based on what you have explained, you may be experiencing the issue listed here:

https://www.cloudera.com/documentation/enterprise/release-notes/topics/cm_rn_known_issues.html#conce...

 

If you don't have kdc_timeout=3000 in the [libdefaults] section of the /etc/krb5.conf file on worker1.com, add it and retry your scenario.
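
A quick way to check whether the setting is present on a host is sketched below. The helper libdefaults_value and the SAMPLE text are hypothetical; the parser handles only plain "name = value" lines and is not a full krb5.conf parser.

```python
import re
from typing import Optional

# Illustrative file content: the [realms] block is the one from the question,
# the [libdefaults] entry is the setting suggested in this reply.
SAMPLE = """
[libdefaults]
kdc_timeout = 3000

[realms]
XXXXXX.COM = {
kdc = master1.com
admin_server = master1.com
kdc = worker1.com
}
"""


def libdefaults_value(conf_text: str, key: str) -> Optional[str]:
    """Return the value of `key` in [libdefaults], or None if it is absent."""
    section = None
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blank lines and comments
        header = re.fullmatch(r"\[(\w+)\]", line)
        if header:
            section = header.group(1)
            continue
        if section == "libdefaults" and "=" in line:
            name, value = (part.strip() for part in line.split("=", 1))
            if name == key:
                return value
    return None


if __name__ == "__main__":
    # On a real host, read /etc/krb5.conf instead of SAMPLE.
    print(libdefaults_value(SAMPLE, "kdc_timeout"))
```

A result of None means the setting is missing from [libdefaults], even if the same name appears in another section.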

 

 

Re: Kerberos slave - high availability

Master Collaborator

@bgooley,
You are right. I have kdc = host1 host2
in my krb5.conf files, and it only works because the first host is available.
So I should change it to:
kdc = host1
kdc = host2

Thanks!

Re: Kerberos slave - high availability

Super Guru

@Tomas79,

 

Some implementations do (or did) support both formats, but for best results use a separate kdc= on each line, which should work with all current Kerberos clients.
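
For files that still use the combined form, the conversion could be scripted along these lines. This is a minimal sketch; split_kdc_line is a hypothetical helper that rewrites a single line and leaves every other line unchanged.

```python
from typing import List


def split_kdc_line(line: str) -> List[str]:
    """Turn 'kdc = host1 host2' into one 'kdc = host' line per host."""
    name, sep, value = line.partition("=")
    hosts = value.split()
    if not sep or name.strip() != "kdc" or not hosts:
        return [line]  # not a kdc entry: keep the line as it is
    indent = line[: len(line) - len(line.lstrip())]
    return [f"{indent}kdc = {host}" for host in hosts]


if __name__ == "__main__":
    for out in split_kdc_line("kdc = host1 host2"):
        print(out)
```

A line that already names a single kdc comes back unchanged, so running the helper over a whole file more than once is harmless.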