Created on 05-02-2018 06:39 PM - edited 09-16-2022 06:10 AM
Hi,
I had enabled Kerberos on my cluster without realizing that the hostname was never included in /etc/hosts. I went and added it, and also removed and re-added Kerberos. I still cannot get rid of this error:
nn/namenodehost1.local@MYREALM.FS for zookeeper/10.169.110.22@MYREALM.FS, Server not found in Kerberos database
It is as if the _HOST variable is not being translated to the host's FQDN.
Any help is really appreciated.
Sadek
Created 05-08-2018 11:53 AM
I went ahead and rebuilt everything from scratch and am still having the same issue. Any idea where ZKFC gets its ZK connection string, besides ha.zookeeper.quorum?
Created 05-08-2018 12:42 PM
I would gladly help, but I need you to share all the steps you executed and the output of the command below.
# kadmin.local listprincs
Please obfuscate any hostname or sensitive info before sharing
Created 05-08-2018 01:19 PM
/var/kerberos/krb5kdc/kadm5.acl: */admin@MYREALM.FS *
Created 05-08-2018 01:23 PM
/etc/krb5.conf:
[libdefaults]
  renew_lifetime = 7d
  forwardable = true
  default_realm = MYREALM.FS
  ticket_lifetime = 24h
  dns_lookup_realm = false
  dns_lookup_kdc = false
  default_ccache_name = /tmp/krb5cc_%{uid}
  #default_tgs_enctypes = aes des3-cbc-sha1 rc4 des-cbc-md5
  #default_tkt_enctypes = aes des3-cbc-sha1 rc4 des-cbc-md5
[logging]
  default = FILE:/var/log/krb5kdc.log
  admin_server = FILE:/var/log/kadmind.log
  kdc = FILE:/var/log/krb5kdc.log
[realms]
  MYREALM.FS = {
    admin_server = mykdc.local
    kdc = mykdc.local
  }
Looking at the hadoop-hdfs-zkfc log file, I am trying to figure out where zkfc gets its zk connection string from:
2018-05-07 16:12:49,965 INFO zookeeper.ClientCnxn (ClientCnxn.java:logStartConnect(1019)) - Opening socket connection to server 10.169.110.22/10.169.110.22:2181. Will attempt to SASL-authenticate using Login Context section 'Client'.
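The host/address pair in that log line is a useful clue. Here is a minimal sketch, assuming the line follows the `server <host>/<ip>:<port>` shape shown above (the function names are hypothetical, not part of ZooKeeper or Hadoop). It extracts the host part and reports whether it is itself an IP literal. When it is, as here, the client had no name for the server at that point, which is consistent with the `zookeeper/10.169.110.22@MYREALM.FS` principal in the error.

```python
import ipaddress
import re

# Matches the "server <host>/<ip>:<port>" fragment of the ClientCnxn log line.
_SERVER_RE = re.compile(r"server (?P<host>[^/\s]+)/(?P<ip>[^:\s]+):(?P<port>\d+)")


def is_ip_literal(text):
    """Return True if text parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(text)
        return True
    except ValueError:
        return False


def server_from_log(line):
    """Return (host, ip, port, host_is_ip), or None if the line does not match."""
    m = _SERVER_RE.search(line)
    if m is None:
        return None
    host = m.group("host")
    return host, m.group("ip"), int(m.group("port")), is_ip_literal(host)
```

If `host_is_ip` comes back true for every ZooKeeper server, it may be worth checking whether the quorum string lists IP addresses and whether each node can resolve the others by name.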
Created 05-08-2018 01:29 PM
I also have
export HADOOP_ZKFC_OPTS="-Dzookeeper.sasl.client=true -Dzookeeper.sasl.client.username=zookeeper -Djava.security.auth.login.config=/usr/hdp/2.6.0.3-8/hadoop/conf/secure/hdfs_jaas.conf -Dzookeeper.sasl.clientconfig=Client $HADOOP_ZKFC_OPTS"
hdfs_jaas.conf:
Client {
  com.sun.security.auth.module.Krb5LoginModule required
  useKeyTab=true
  storeKey=true
  useTicketCache=false
  keyTab="/etc/security/keytabs/nn.service.keytab"
  principal="nn/namenodehost1.local@MYREALM.FS";
};
Created 05-08-2018 01:51 PM
A properly functioning DNS server for your domain, and working DNS resolvers on every machine participating in your Kerberos realm, are essential for the realm to operate correctly.
Kerberos can use DNS as a service location protocol, either through DNS SRV records (defined in RFC 2782, which obsoletes RFC 2052) or through a TXT record that maps a host or domain name to the appropriate realm.
Are you using MIT Kerberos? Can you update your krb5.conf on all the nodes by adding:
[libdefaults]
  rdns = false
Your problem is a DNS issue, which is why I wanted the entries in /etc/hosts. As a workaround, if your cluster is small, you could propagate a correct hosts file to every node while you resolve the DNS issue.
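One way to see whether a node has this kind of DNS problem is to resolve a name to an address and then back again. The following is only a sketch under stated assumptions: `check_name` is a hypothetical helper, and it relies on the system resolver through Python's `socket` module, so it sees whatever that resolver is configured to consult. The resolver functions are passed in as parameters so the logic can be exercised without touching the network.

```python
import socket


def check_name(fqdn, forward=socket.gethostbyname, reverse=socket.gethostbyaddr):
    """Resolve fqdn to an IP and back; return (ip, reverse_name, consistent).

    consistent is True only when the reverse lookup returns the same name
    that was asked for (case-insensitive, ignoring a trailing dot).
    """
    try:
        ip = forward(fqdn)
    except OSError:  # socket.gaierror and socket.herror are OSError subclasses
        return None, None, False
    try:
        reverse_name = reverse(ip)[0]
    except OSError:
        return ip, None, False
    consistent = reverse_name.lower().rstrip(".") == fqdn.lower().rstrip(".")
    return ip, reverse_name, consistent
```

Running `check_name("namenodehost1.local")` on each node, for each cluster host, should show which names fail to resolve or resolve back to something different.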
Setting Up KDC Discovery Over DNS
To use KDC discovery over DNS, the following records should be placed in the zone file corresponding to the Kerberos realm. In most cases, since the Kerberos realm name is simply an uppercase version of the DNS domain owned by the organization, these DNS entries are placed into the organization’s existing DNS zone file.
Note, however, that if the Kerberos realm and DNS domain differ, a new zone must be created with the name of the Kerberos realm. Typically your network team should be able to help with the DNS zone update.
Example zone file entries:
_kerberos._udp.MYREALM.FS. IN SRV 10 0 88 {your_kdc_server}.myrealm.fs.
_kerberos._tcp.MYREALM.FS. IN SRV 10 0 88 {your_kdc_server}.myrealm.fs.
_kerberos-adm._tcp.MYREALM.FS. IN SRV 1 0 749 {your_kdc_server}.myrealm.fs.
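If it helps to generate those entries rather than type them, here is a small sketch that reproduces the three records above for a given realm and KDC host. The function name and the example host `kdc1.myrealm.fs` are hypothetical; the priorities, weights, and ports are copied from the example, not recommendations.

```python
def kdc_srv_records(realm, kdc_host, kdc_port=88, admin_port=749):
    """Return zone-file SRV lines for KDC discovery, mirroring the example above."""
    owner = realm.upper().rstrip(".") + "."      # fully qualified owner name
    target = kdc_host.lower().rstrip(".") + "."  # fully qualified KDC host
    return [
        f"_kerberos._udp.{owner} IN SRV 10 0 {kdc_port} {target}",
        f"_kerberos._tcp.{owner} IN SRV 10 0 {kdc_port} {target}",
        f"_kerberos-adm._tcp.{owner} IN SRV 1 0 {admin_port} {target}",
    ]


if __name__ == "__main__":
    for record in kdc_srv_records("MYREALM.FS", "kdc1.myrealm.fs"):
        print(record)
```

Note that with `dns_lookup_kdc = false`, as in the krb5.conf posted earlier in this thread, clients take the KDC from the `[realms]` section and would not consult these records.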
Hope that helps
Created 05-08-2018 09:39 PM
I think you also forgot the [domain_realm] section. I have added it to your original krb5.conf; please back up your current krb5.conf and copy and paste the one below:
[libdefaults]
  renew_lifetime = 7d
  forwardable = true
  default_realm = MYREALM.FS
  ticket_lifetime = 24h
  dns_lookup_realm = false
  dns_lookup_kdc = false
  default_ccache_name = /tmp/krb5cc_%{uid}
  #default_tgs_enctypes = aes des3-cbc-sha1 rc4 des-cbc-md5
  #default_tkt_enctypes = aes des3-cbc-sha1 rc4 des-cbc-md5
[domain_realm]
  .myrealm.fs = MYREALM.FS
  myrealm.fs = MYREALM.FS
[logging]
  default = FILE:/var/log/krb5kdc.log
  admin_server = FILE:/var/log/kadmind.log
  kdc = FILE:/var/log/krb5kdc.log
[realms]
  MYREALM.FS = {
    admin_server = mykdc.local
    kdc = mykdc.local
  }
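To illustrate what the [domain_realm] entries do, here is a simplified model of the lookup, not the Kerberos library's actual code: an exact host name entry is tried first, then each parent domain written with a leading dot. The function name is hypothetical, and what the library does when nothing matches is outside this sketch.

```python
def realm_for_host(hostname, domain_realm):
    """Return the mapped realm for hostname, or None if no entry applies.

    Tries the exact host name first, then each parent domain written with a
    leading dot (for example ".myrealm.fs"), from most to least specific.
    """
    name = hostname.lower().rstrip(".")
    if name in domain_realm:
        return domain_realm[name]
    labels = name.split(".")
    for i in range(1, len(labels)):
        suffix = "." + ".".join(labels[i:])
        if suffix in domain_realm:
            return domain_realm[suffix]
    return None


if __name__ == "__main__":
    mapping = {".myrealm.fs": "MYREALM.FS", "myrealm.fs": "MYREALM.FS"}
    for host in ("node1.myrealm.fs", "namenodehost1.local"):
        print(host, "->", realm_for_host(host, mapping))
```

Under this model, a host such as `namenodehost1.local` matches neither entry, so the mapping only helps hosts whose names end in `.myrealm.fs`.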
Then restart the Kerberos daemons below:
# service krb5kdc restart
# service kadmin restart
Please let me know
Created 05-08-2018 09:55 PM
Adding all nodes in /etc/hosts across all of them fixed the problem. Thanks!
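For anyone applying the same fix, a quick way to confirm that every node is listed is to parse the hosts file and compare it against the list of cluster hosts. This is only a sketch: the function names are hypothetical, and `zkhost1.local` in the example is an invented name for the ZooKeeper server at 10.169.110.22.

```python
def parse_hosts(text):
    """Map every name in hosts-file text to its IP address (first entry wins)."""
    names = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        ip, *hostnames = line.split()
        for hostname in hostnames:
            names.setdefault(hostname.lower(), ip)
    return names


def missing_nodes(hosts_text, cluster_nodes):
    """Return the cluster node names that have no entry in hosts_text."""
    known = parse_hosts(hosts_text)
    return [node for node in cluster_nodes if node.lower() not in known]


if __name__ == "__main__":
    with open("/etc/hosts") as handle:
        # Hypothetical node list; replace with the cluster's real FQDNs.
        print(missing_nodes(handle.read(), ["namenodehost1.local", "zkhost1.local"]))
```

An empty list on every node means each host can at least find the others by name through /etc/hosts.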