Created 06-01-2017 05:08 PM
Hi all,
I am having a problem with the NameNode status that Ambari shows. The following points are verifiable in the system:
- The NameNode keeps going down a few seconds after I start it through Ambari (it looks like it never really comes up, although the start process runs successfully);
- Despite being DOWN according to Ambari, if I run jps on the server where the NameNode is hosted, it shows that the service is running:
[hdfs@RHTPINEC008 ~]$ jps
39395 NameNode
4463 Jps
and I can access the NameNode UI properly;
- I already restarted both the NameNode and the ambari-agent manually, but the behavior stays the same;
- This problem started after some heavy HBase/Phoenix queries that caused the NameNode to go down (not sure if this is actually related, but the exact same configuration was working well before this episode);
- I've been digging for some hours and have not been able to find error details in the NameNode logs or the ambari-agent logs that would allow me to understand the problem.
I am using HDP 2.4.0 with no HA options.
Can someone help with this?
Thanks in advance
Created 05-08-2018 10:41 PM
Any updates?
Created 05-09-2018 08:48 AM
Sorry, I work until 2 PM EST, which is why I was late in answering. I am using AD, and the users were already created in AD before the HDP installation. Yes, a one-way trust has been set up.
hostName=node1.test.co
Contents of /etc/krb5.conf :
includedir /etc/krb5.conf.d/
includedir /var/lib/sss/pubconf/krb5.include.d/
[logging]
default = FILE:/var/log/krb5libs.log
kdc = FILE:/var/log/krb5kdc.log
admin_server = FILE:/var/log/kadmind.log
[libdefaults]
dns_lookup_realm = false
ticket_lifetime = 24h
renew_lifetime = 7d
forwardable = true
rdns = false
# default_realm = EXAMPLE.COM
default_ccache_name = KEYRING:persistent:%{uid}
default_realm = TEST.CO
[realms]
# EXAMPLE.COM = {
# kdc = kerberos.example.com
# admin_server = kerberos.example.com
# }
TEST.CO = {
}
[domain_realm]
# .example.com = EXAMPLE.COM
# example.com = EXAMPLE.COM
test.co = TEST.CO
.test.co = TEST.CO
Created 05-09-2018 10:50 AM
When using MIT KDC there are 3 important files that MUST be set correctly for Kerberos to function. Their locations might vary depending on the OS.
/var/kerberos/krb5kdc/kdc.conf
/var/kerberos/krb5kdc/kadm5.acl
/etc/krb5.conf
I have seen a couple of issues in your krb5.conf and have corrected them below. Replace {your_kdc_server} with your KDC FQDN.
Back up your current krb5.conf:
# cp /etc/krb5.conf /etc/krb5.conf.bak
Then edit /etc/krb5.conf, delete the contents, and replace them with the content below:
# vi /etc/krb5.conf
Paste:
[libdefaults]
renew_lifetime = 7d
forwardable = true
default_realm = TEST.CO
dns_lookup_realm = false
ticket_lifetime = 24h
rdns = false
default_ccache_name = KEYRING:persistent:%{uid}
[domain_realm]
test.co = TEST.CO
.test.co = TEST.CO
[logging]
default = FILE:/var/log/krb5libs.log
kdc = FILE:/var/log/krb5kdc.log
admin_server = FILE:/var/log/kadmind.log
[realms]
TEST.CO = {
admin_server = {your_kdc_server}
kdc = {your_kdc_server}
}
/var/kerberos/krb5kdc/kdc.conf
[kdcdefaults]
kdc_ports = 88
kdc_tcp_ports = 88
[realms]
TEST.CO = {
#master_key_type = aes256-cts
acl_file = /var/kerberos/krb5kdc/kadm5.acl
dict_file = /usr/share/dict/words
admin_keytab = /var/kerberos/krb5kdc/kadm5.keytab
supported_enctypes = aes256-cts:normal aes128-cts:normal des3-hmac-sha1:normal arcfour-hmac:normal des-hmac-sha1:normal des-cbc-md5:normal des-cbc-crc:normal
}
Your /var/kerberos/krb5kdc/kadm5.acl should look like this (note the space before the last *):
*/admin@TEST.CO *
Restart the KDC daemons:
# service krb5kdc start
# service kadmin start
Please correct the above files so we know Kerberos is set up correctly, and report back with the new error, if any.
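After editing, a quick check that the realm block in krb5.conf is no longer empty can save a round trip. The following is only a minimal sketch: `check_realm` is a hypothetical helper name (not part of any Kerberos tooling), and it assumes one setting per line with the realm name at the start of the line that opens the block.

```shell
#!/bin/sh
# Sketch: report whether a realm block in a krb5.conf-style file defines
# both "kdc" and "admin_server". check_realm is a hypothetical helper.
check_realm() {
    conf="$1"    # path to the krb5.conf to inspect
    realm="$2"   # realm name, e.g. TEST.CO
    awk -v realm="$realm" '
        /^[ \t]*#/ { next }                              # skip comment lines
        $1 == realm && index($0, "{") > 0 { inside = 1; next }   # block opens
        inside && index($0, "}") > 0 { inside = 0; next }        # block closes
        inside {
            key = $1
            sub(/=.*/, "", key)                          # tolerate "kdc=host"
            if (key == "kdc") kdc = 1
            if (key == "admin_server") adm = 1
        }
        END {
            if (kdc && adm) { print "ok"; exit 0 }
            print "missing kdc or admin_server for " realm
            exit 1
        }
    ' "$conf"
}
```

For example, `check_realm /etc/krb5.conf TEST.CO` returns a non-zero status for a file whose `TEST.CO = { }` block is empty. It only inspects the file text; it does not confirm that the named server is reachable or that authentication works.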
Created 05-09-2018 11:01 AM
Thank you so much. We don't have a KDC server installed; we are using LDAP. Do I need to put the AD server in place of "{your_kdc_server}"?
admin_server = {your_kdc_server}
Created 05-09-2018 11:03 AM
If your ticket granter is AD, then yes, replace it accordingly.
Created 05-09-2018 11:07 AM
Created 05-11-2018 02:36 PM
@Saulo Sobreiro
@Subramanian Govindasamy
Any updates?
Created 05-13-2018 07:09 PM
Make sure you have an odd number of JournalNodes (JNs) and that all of them are healthy.
Created 05-25-2018 10:57 AM
Sorry for the delay. Hortonworks Support stated that the '@' symbol is not supported, so we are doing a fresh reinstall with local users and syncing with LDAP.
Thanks