
Namenode not starting after kerberos enable

Explorer

Hi Team,

I need your help. I have built a single-node cluster and enabled Kerberos on it. Since then, the NameNode does not start, and neither do the services that start after it.

I am currently using HDP 3.1.4.

My krb5.conf file looks like this:

vi /etc/krb5.conf

[libdefaults]
default_ream = HADOOP.COM

[realm]
HADOOP.COM = {
kdc = test.bms.com
admin_server = test.bms.com
}

[domain_realm]
.hadoop.com = HADOOP.COM
hadoop.com = HADOOP.COM
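
One thing worth checking in the file above: the MIT Kerberos documentation spells the section `[realms]` and the setting `default_realm`, whereas the file as posted shows `[realm]` and `default_ream`. These may only be typos in the post. A minimal sketch of a check is below; the function name `check_krb5_conf` is hypothetical, and it only looks for those two spellings rather than validating the whole file.

```shell
#!/bin/sh
# Hypothetical helper (not part of any Kerberos package): report whether a
# krb5.conf contains the documented names `default_realm` and `[realms]`.
# It does not check which section default_realm appears in.
check_krb5_conf() {
    conf="${1:-/etc/krb5.conf}"
    status=0

    # Look for a "default_realm = ..." setting (leading whitespace allowed).
    if ! grep -Eq '^[[:space:]]*default_realm[[:space:]]*=' "$conf"; then
        echo "missing: default_realm"
        status=1
    fi

    # Look for a section header spelled exactly [realms].
    if ! grep -Eq '^[[:space:]]*\[realms\][[:space:]]*$' "$conf"; then
        echo "missing: [realms] section"
        status=1
    fi

    if [ "$status" -eq 0 ]; then
        echo "ok"
    fi
    return "$status"
}
```

It could be run as `check_krb5_conf /etc/krb5.conf`; a non-zero exit status means at least one of the two names was not found.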

 

Below is the error message observed during NameNode start:

2020-04-24 19:02:12,623 - File['/var/run/hadoop/hdfs/hadoop-hdfs-namenode.pid'] {'action': ['delete'], 'not_if': 'ambari-sudo.sh -H -E test -f /var/run/hadoop/hdfs/hadoop-hdfs-namenode.pid && ambari-sudo.sh -H -E pgrep -F /var/run/hadoop/hdfs/hadoop-hdfs-namenode.pid'}
2020-04-24 19:02:12,867 - Deleting File['/var/run/hadoop/hdfs/hadoop-hdfs-namenode.pid']
2020-04-24 19:02:12,867 - Execute['ambari-sudo.sh su hdfs -l -s /bin/bash -c 'ulimit -c unlimited ; /usr/hdp/3.1.4.0-315/hadoop/bin/hdfs --config /usr/hdp/3.1.4.0-315/hadoop/conf --daemon start namenode''] {'environment': {'HADOOP_LIBEXEC_DIR': '/usr/hdp/3.1.4.0-315/hadoop/libexec'}, 'not_if': 'ambari-sudo.sh -H -E test -f /var/run/hadoop/hdfs/hadoop-hdfs-namenode.pid && ambari-sudo.sh -H -E pgrep -F /var/run/hadoop/hdfs/hadoop-hdfs-namenode.pid'}
2020-04-24 19:02:18,008 - Execute['/usr/bin/kinit -kt /etc/security/keytabs/hdfs.headless.keytab hdfs-testhadoop@HADOOP.COM'] {'user': 'hdfs'}
2020-04-24 19:02:19,699 - Waiting for this NameNode to leave Safemode due to the following conditions: HA: False, isActive: True, upgradeType: None
2020-04-24 19:02:19,699 - Waiting up to 19 minutes for the NameNode to leave Safemode...
2020-04-24 19:02:19,700 - Execute['/usr/hdp/current/hadoop-hdfs-namenode/bin/hdfs dfsadmin -fs hdfs://test.bms.com:8020 -safemode get | grep 'Safe mode is OFF''] {'logoutput': True, 'tries': 115, 'user': 'hdfs', 'try_sleep': 10}
safemode: Call From test.bms.com/192.168.0.107 to test.bms.com:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
2020-04-24 19:02:40,056 - Retrying after 10 seconds. Reason: Execution of '/usr/hdp/current/hadoop-hdfs-namenode/bin/hdfs dfsadmin -fs hdfs://test.bms.com:8020 -safemode get | grep 'Safe mode is OFF'' returned 1. safemode: Call From test.bms.com/192.168.0.107 to test.bms.com:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
safemode: Call From test.bms.com/192.168.0.107 to test.bms.com:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused

 

2020-04-24 20:54:30,361 INFO util.ExitUtil (ExitUtil.java:terminate(210)) - Exiting with status 1: org.apache.hadoop.security.KerberosAuthException: failure to login: for principal: nn/test.bms.com@HADOOP.COM from keytab /etc/security/keytabs/nn.service.keytab javax.security.auth.login.LoginException: Message stream modified (41)
2020-04-24 20:54:30,650 INFO namenode.NameNode (LogAdapter.java:info(51)) - SHUTDOWN_MSG:

 

 

Thanks in advance.

2 REPLIES

Expert Contributor

@shrikant_bm Are you able to run kinit with /etc/security/keytabs/nn.service.keytab?
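
A keytab login test along these lines might look like the sketch below. The keytab path and principal are copied from the NameNode error in the question; the function name `verify_keytab` is hypothetical, and the commands probably need to run as the service user (`hdfs` in the Ambari log) or as root so that the keytab is readable.

```shell
#!/bin/sh
# Values taken from the NameNode error in the question; override as needed.
KEYTAB="${KEYTAB:-/etc/security/keytabs/nn.service.keytab}"
PRINCIPAL="${PRINCIPAL:-nn/test.bms.com@HADOOP.COM}"

# Hypothetical helper: attempt a keytab login and show the resulting ticket.
verify_keytab() {
    keytab="$1"
    principal="$2"

    if [ ! -r "$keytab" ]; then
        echo "keytab not readable: $keytab"
        return 2
    fi

    # List the principals stored in the keytab, with timestamps.
    klist -kt "$keytab" || return 1

    # Request a ticket using the keytab instead of a password.
    kinit -kt "$keytab" "$principal" || return 1

    # Show the credential cache to confirm that a ticket was issued.
    klist
}
```

Calling `verify_keytab "$KEYTAB" "$PRINCIPAL"` and comparing the principal names printed by `klist -kt` with the one in the error message would show whether the keytab and the configured principal agree.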

Explorer

@Scharan thanks for the update!

I have built a new test cluster, on which I am facing the same issue again. Since the parameters used are different, I have opened a new thread.

The link is below:

https://community.cloudera.com/t5/Support-Questions/namenode-failed-to-start-after-enabling-kerberos...

 

Can you please check and help with it?

Thanks in advance.