Cluster non operational after enabling Kerberos.


New Contributor

Hi,

I enabled Kerberos on my cluster without realizing that the hostname was never included in /etc/hosts. I have since added it, and also removed and re-enabled Kerberos, but I still cannot get rid of this error:

nn/namenodehost1.local@MYREALM.FS for zookeeper/10.169.110.22@MYREALM.FS, Server not found in Kerberos database

It is as if the _HOST variable doesn't get translated to the host's FQDN.
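My understanding is that Hadoop expands _HOST in a configured principal from the node's canonical FQDN, roughly like this sketch (the pattern and hostname below are illustrative, not from my cluster):

```python
# Rough sketch of how Hadoop-style principal patterns expand _HOST.
# Pattern and hostname are illustrative placeholders.

def expand_host(principal_pattern, fqdn):
    """Replace the _HOST placeholder with the lowercased canonical FQDN."""
    return principal_pattern.replace("_HOST", fqdn.lower())

print(expand_host("zookeeper/_HOST@MYREALM.FS", "Zk1.Myrealm.Fs"))
# If a node only resolves to an IP address, the "FQDN" ends up being the
# IP, which yields principals like zookeeper/10.169.110.22@MYREALM.FS.
```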

Any help is really appreciated.

Sadek

1 ACCEPTED SOLUTION

Re: Cluster non operational after enabling Kerberos.

Mentor

@Sadek M

A properly functioning DNS server for your domain, and working DNS resolvers on every machine participating in your Kerberos realm, are essential for the realm to operate correctly.

Kerberos can use DNS as a service-location protocol, either through DNS SRV records as defined in RFC 2782 (which obsoleted RFC 2052) or through TXT records, to locate the appropriate KDC or realm for a given host or domain name.

Are you using MIT Kerberos? Can you update krb5.conf on all the nodes by adding:

[libdefaults] 
    rdns = false 

Your problem is a DNS issue; that's why I wanted the entries in /etc/hosts. As a workaround, if your cluster is small, you could propagate a correct /etc/hosts file to every node while you resolve the DNS issue.
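For example, each node's /etc/hosts would carry an entry per cluster host, along these lines (the ZooKeeper IP is from your error message; the hostnames are placeholders for your actual ones):

```
10.169.110.22   zk1.myrealm.fs             zk1
10.169.110.23   namenodehost1.myrealm.fs   namenodehost1
```

The key point is that each IP maps to the FQDN first, so both forward and reverse lookups return the name Kerberos expects in the service principal.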

Setting Up KDC Discovery Over DNS

To use KDC discovery over DNS, the following records should be placed in the zone file corresponding to the Kerberos realm. In most cases, since the Kerberos realm name is simply an uppercase version of the DNS domain owned by the organization, these DNS entries are placed into the organization’s existing DNS zone file.

Note, however, that if the Kerberos realm and DNS domain differ, a new zone must be created with the name of the Kerberos realm; typically your network team should be able to help with the DNS zone update.

An example zone file:

_kerberos._udp.MYREALM.FS.     IN SRV 10 0 88  {your_kdc_server}.myrealm.fs.
_kerberos._tcp.MYREALM.FS.     IN SRV 10 0 88  {your_kdc_server}.myrealm.fs.
_kerberos-adm._tcp.MYREALM.FS. IN SRV 1  0 749 {your_kdc_server}.myrealm.fs.

Hope that helps

8 REPLIES 8

Re: Cluster non operational after enabling Kerberos.

New Contributor

I went ahead and rebuilt everything from scratch and I am still having the same issue. Any idea where ZKFC gets its ZK connection string, besides ha.zookeeper.quorum?
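To double-check what value is actually on disk, here is a quick sketch of pulling a property out of a Hadoop-style *-site.xml file (on my nodes the real file would be under /etc/hadoop/conf; the helper name is just illustrative):

```python
import xml.etree.ElementTree as ET

def get_hadoop_property(conf, name):
    """Return the value of a property from a Hadoop *-site.xml
    configuration (path or file object), or None if absent."""
    root = ET.parse(conf).getroot()
    for prop in root.findall("property"):
        if prop.findtext("name") == name:
            return prop.findtext("value")
    return None

# e.g. get_hadoop_property("/etc/hadoop/conf/hdfs-site.xml",
#                          "ha.zookeeper.quorum")
```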

Re: Cluster non operational after enabling Kerberos.

Mentor

@Sadek M

I would gladly help, but I need you to share all the steps you executed, plus the information below:

  • HDP /Ambari versions
  • Cluster OS
  • /etc/hosts entry [Assuming they are identical]
  • Number of nodes in Cluster
  • KDC setup process
  • Ambari Kerberos enabling errors.
  • Output of
# kadmin.local           
listprincs
  • kadm5.acl usually in /var/kerberos/krb5kdc
  • krb5.conf in /etc

Please obfuscate any hostnames or sensitive info before sharing.

Re: Cluster non operational after enabling Kerberos.

New Contributor
/var/kerberos/krb5kdc/kadm5.acl:
*/admin@MYREALM.FS    *

Re: Cluster non operational after enabling Kerberos.

New Contributor
/etc/krb5.conf:


[libdefaults]
  renew_lifetime = 7d
  forwardable = true
  default_realm = MYREALM.FS
  ticket_lifetime = 24h
  dns_lookup_realm = false
  dns_lookup_kdc = false
  default_ccache_name = /tmp/krb5cc_%{uid}
  #default_tgs_enctypes = aes des3-cbc-sha1 rc4 des-cbc-md5
  #default_tkt_enctypes = aes des3-cbc-sha1 rc4 des-cbc-md5
 
[logging]
  default = FILE:/var/log/krb5kdc.log
  admin_server = FILE:/var/log/kadmind.log
  kdc = FILE:/var/log/krb5kdc.log
 
[realms]
  MYREALM.FS = {
    admin_server = mykdc.local
    kdc = mykdc.local
  }

Looking at the hadoop-hdfs-zkfc log file, I am trying to figure out where zkfc gets its zk connection string from:

2018-05-07 16:12:49,965 INFO zookeeper.ClientCnxn (ClientCnxn.java:logStartConnect(1019)) - Opening socket connection to server 10.169.110.22/10.169.110.22:2181. Will attempt to SASL-authenticate using Login Context section 'Client'.

Re: Cluster non operational after enabling Kerberos.

New Contributor

I also have

 export HADOOP_ZKFC_OPTS="-Dzookeeper.sasl.client=true 
 -Dzookeeper.sasl.client.username=zookeeper 
 -Djava.security.auth.login.config=/usr/hdp/2.6.0.3-8/hadoop/conf/secure/hdfs_jaas.conf 
 -Dzookeeper.sasl.clientconfig=Client $HADOOP_ZKFC_OPTS"

hdfs_jaas.conf:

Client {
      com.sun.security.auth.module.Krb5LoginModule required
      useKeyTab=true
      storeKey=true
      useTicketCache=false
      keyTab="/etc/security/keytabs/nn.service.keytab"
      principal="nn/namenodehost1.local@MYREALM.FS";
};


Re: Cluster non operational after enabling Kerberos.

Mentor

@Sadek M

I think you also forgot the [domain_realm] section. I have added it to your original krb5.conf; please back up your current krb5.conf and replace it with the one below:

[libdefaults]
  renew_lifetime = 7d
  forwardable = true
  default_realm = MYREALM.FS
  ticket_lifetime = 24h
  dns_lookup_realm = false
  dns_lookup_kdc = false
  default_ccache_name = /tmp/krb5cc_%{uid}
  #default_tgs_enctypes = aes des3-cbc-sha1 rc4 des-cbc-md5
  #default_tkt_enctypes = aes des3-cbc-sha1 rc4 des-cbc-md5

[domain_realm]
  .myrealm.fs = MYREALM.FS
  myrealm.fs = MYREALM.FS

[logging]
  default = FILE:/var/log/krb5kdc.log
  admin_server = FILE:/var/log/kadmind.log
  kdc = FILE:/var/log/krb5kdc.log

[realms]
  MYREALM.FS = {
    admin_server = mykdc.local
    kdc = mykdc.local
  }
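For context, libkrb5 maps a hostname to a realm using the [domain_realm] entries: an exact hostname match wins, otherwise the longest matching domain suffix (entries beginning with a dot) applies. A rough sketch of that logic, simplified and not the actual library code:

```python
def realm_for_host(hostname, domain_realm):
    """Map a hostname to a realm the way [domain_realm] entries work:
    exact hostname match first, then the longest matching suffix
    among entries that start with a dot."""
    host = hostname.lower()
    if host in domain_realm:
        return domain_realm[host]
    best = None
    for pattern, realm in domain_realm.items():
        if pattern.startswith(".") and host.endswith(pattern):
            if best is None or len(pattern) > len(best[0]):
                best = (pattern, realm)
    return best[1] if best else None

mapping = {".myrealm.fs": "MYREALM.FS", "myrealm.fs": "MYREALM.FS"}
print(realm_for_host("zk1.myrealm.fs", mapping))  # MYREALM.FS
```

This is why both the dotted and undotted forms appear in the section: one covers hosts under the domain, the other the domain name itself.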

Then restart the Kerberos daemons:

# service krb5kdc restart
# service kadmin restart

Please let me know

Re: Cluster non operational after enabling Kerberos.

New Contributor

Adding all nodes in /etc/hosts across all of them fixed the problem. Thanks!