namenode failed to start after enabling kerberos

Explorer

Hi Team,

 

I am currently using HDP 3.0 and Ambari 2.7.3. I have enabled Kerberos from Ambari.

The KDC is installed and configured, and I am able to kinit and create principals.

I tried the following:

kinit -kt /etc/security/keytabs/nn.service.keytab nn/mastern1.bms.com@BMS.COM
[root@mastern1 ~]# klist -e
Ticket cache: FILE:/tmp/krb5cc_0
Default principal: nn/mastern1.bms.com@BMS.COM

Valid starting Expires Service principal
06/04/2020 20:19:07 06/05/2020 20:19:07 krbtgt/BMS.COM@BMS.COM
Etype (skey, tkt): aes256-cts-hmac-sha1-96, aes256-cts-hmac-sha1-96

 

[root@mastern1 ~]# cat /etc/krb5.conf

[libdefaults]
renew_lifetime = 7d
forwardable = true
default_realm = BMS.COM
ticket_lifetime = 24h
dns_lookup_realm = false
dns_lookup_kdc = false
default_ccache_name = /tmp/krb5cc_%{uid}
#default_tgs_enctypes = aes des3-cbc-sha1 rc4 des-cbc-md5
#default_tkt_enctypes = aes des3-cbc-sha1 rc4 des-cbc-md5
udp_preference_limit = 1

[domain_realm]
bms.com = BMS.COM

[logging]
default = FILE:/var/log/krb5kdc.log
admin_server = FILE:/var/log/kadmind.log
kdc = FILE:/var/log/krb5kdc.log

[realms]
BMS.COM = {
admin_server = mastern1.bms.com
kdc = mastern1.bms.com
}

 

 

The error I see in the NameNode log is below:

 

STARTUP_MSG: java = 1.8.0_252
************************************************************/
2020-06-04 20:13:01,750 INFO namenode.NameNode (LogAdapter.java:info(51)) - registered UNIX signal handlers for [TERM, HUP, INT]
2020-06-04 20:13:02,322 INFO namenode.NameNode (NameNode.java:createNameNode(1583)) - createNameNode []
2020-06-04 20:13:03,145 INFO impl.MetricsConfig (MetricsConfig.java:loadFirst(118)) - Loaded properties from hadoop-metrics2.properties
2020-06-04 20:13:05,390 INFO timeline.HadoopTimelineMetricsSink (HadoopTimelineMetricsSink.java:init(85)) - Initializing Timeline metrics sink.
2020-06-04 20:13:05,390 INFO timeline.HadoopTimelineMetricsSink (HadoopTimelineMetricsSink.java:init(105)) - Identified hostname = mastern1.bms.com, serviceName = namenode
2020-06-04 20:13:06,516 WARN availability.MetricCollectorHAHelper (MetricCollectorHAHelper.java:findLiveCollectorHostsFromZNode(90)) - Unable to connect to zookeeper.
org.apache.ambari.metrics.sink.relocated.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /ambari-metrics-cluster
at org.apache.ambari.metrics.sink.relocated.zookeeper.KeeperException.create(KeeperException.java:99)
at org.apache.ambari.metrics.sink.relocated.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.ambari.metrics.sink.relocated.zookeeper.ZooKeeper.exists(ZooKeeper.java:1909)
at org.apache.ambari.metrics.sink.relocated.zookeeper.ZooKeeper.exists(ZooKeeper.java:1937)
at org.apache.hadoop.metrics2.sink.timeline.availability.MetricCollectorHAHelper.findLiveCollectorHostsFromZNode(MetricCollectorHAHelper.java:77)
at org.apache.hadoop.metrics2.sink.timeline.AbstractTimelineMetricsSink.findPreferredCollectHost(AbstractTimelineMetricsSink.java:540)
at org.apache.hadoop.metrics2.sink.timeline.HadoopTimelineMetricsSink.init(HadoopTimelineMetricsSink.java:125)
at org.apache.hadoop.metrics2.impl.MetricsConfig.getPlugin(MetricsConfig.java:207)
at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.newSink(MetricsSystemImpl.java:531)
at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.configureSinks(MetricsSystemImpl.java:503)
at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.configure(MetricsSystemImpl.java:479)
at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.start(MetricsSystemImpl.java:188)
at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.init(MetricsSystemImpl.java:163)
at org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.init(DefaultMetricsSystem.java:62)
at org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.initialize(DefaultMetricsSystem.java:58)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1642)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1710)
2020-06-04 20:13:07,547 INFO timeline.HadoopTimelineMetricsSink (HadoopTimelineMetricsSink.java:init(133)) - No suitable collector found.
2020-06-04 20:13:07,551 INFO timeline.HadoopTimelineMetricsSink (HadoopTimelineMetricsSink.java:init(185)) - RPC port properties configured: {8020=client}
2020-06-04 20:13:07,618 INFO impl.MetricsSinkAdapter (MetricsSinkAdapter.java:start(204)) - Sink timeline started
2020-06-04 20:13:08,229 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:startTimer(374)) - Scheduled Metric snapshot period at 10 second(s).
2020-06-04 20:13:08,229 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:start(191)) - NameNode metrics system started
2020-06-04 20:13:08,480 INFO namenode.NameNodeUtils (NameNodeUtils.java:getClientNamenodeAddress(79)) - fs.defaultFS is hdfs://mastern1.bms.com:8020
2020-06-04 20:13:08,480 INFO namenode.NameNode (NameNode.java:<init>(928)) - Clients should use mastern1.bms.com:8020 to access this namenode/service.
2020-06-04 20:13:10,005 ERROR namenode.NameNode (NameNode.java:main(1715)) - Failed to start namenode.
org.apache.hadoop.security.KerberosAuthException: failure to login: for principal: nn/mastern1.bms.com@BMS.COM from keytab /etc/security/keytabs/nn.service.keytab javax.security.auth.login.LoginException: Message stream modified (41)
at org.apache.hadoop.security.UserGroupInformation.doSubjectLogin(UserGroupInformation.java:1847)
at org.apache.hadoop.security.UserGroupInformation.loginUserFromKeytabAndReturnUGI(UserGroupInformation.java:1215)
at org.apache.hadoop.security.UserGroupInformation.loginUserFromKeytab(UserGroupInformation.java:1008)
at org.apache.hadoop.security.SecurityUtil.login(SecurityUtil.java:313)
at org.apache.hadoop.hdfs.server.namenode.NameNode.loginAsNameNodeUser(NameNode.java:661)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:680)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:937)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:910)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1643)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1710)
Caused by: javax.security.auth.login.LoginException: Message stream modified (41)
at com.sun.security.auth.module.Krb5LoginModule.attemptAuthentication(Krb5LoginModule.java:808)
at com.sun.security.auth.module.Krb5LoginModule.login(Krb5LoginModule.java:618)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at javax.security.auth.login.LoginContext.invoke(LoginContext.java:755)
at javax.security.auth.login.LoginContext.access$000(LoginContext.java:195)
at javax.security.auth.login.LoginContext$4.run(LoginContext.java:682)
at javax.security.auth.login.LoginContext$4.run(LoginContext.java:680)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:680)
at javax.security.auth.login.LoginContext.login(LoginContext.java:587)
at org.apache.hadoop.security.UserGroupInformation$HadoopLoginContext.login(UserGroupInformation.java:1926)
at org.apache.hadoop.security.UserGroupInformation.doSubjectLogin(UserGroupInformation.java:1837)
... 9 more
Caused by: KrbException: Message stream modified (41)
at sun.security.krb5.KrbKdcRep.check(KrbKdcRep.java:101)
at sun.security.krb5.KrbAsRep.decrypt(KrbAsRep.java:159)
at sun.security.krb5.KrbAsRep.decryptUsingKeyTab(KrbAsRep.java:121)
at sun.security.krb5.KrbAsReqBuilder.resolve(KrbAsReqBuilder.java:308)
at sun.security.krb5.KrbAsReqBuilder.action(KrbAsReqBuilder.java:447)
at com.sun.security.auth.module.Krb5LoginModule.attemptAuthentication(Krb5LoginModule.java:780)
... 23 more
2020-06-04 20:13:10,011 INFO util.ExitUtil (ExitUtil.java:terminate(210)) - Exiting with status 1: org.apache.hadoop.security.KerberosAuthException: failure to login: for principal: nn/mastern1.bms.com@BMS.COM from keytab /etc/security/keytabs/nn.service.keytab javax.security.auth.login.LoginException: Message stream modified (41)
2020-06-04 20:13:10,132 INFO namenode.NameNode (LogAdapter.java:info(51)) - SHUTDOWN_MSG:

 

 

When I start the NameNode service from Ambari, this is the message I see:

2020-06-04 20:13:01,851 - Waiting for this NameNode to leave Safemode due to the following conditions: HA: False, isActive: True, upgradeType: None
2020-06-04 20:13:01,852 - Waiting up to 19 minutes for the NameNode to leave Safemode...
2020-06-04 20:13:01,852 - Execute['/usr/hdp/current/hadoop-hdfs-namenode/bin/hdfs dfsadmin -fs hdfs://mastern1.bms.com:8020 -safemode get | grep 'Safe mode is OFF''] {'logoutput': True, 'tries': 115, 'user': 'hdfs', 'try_sleep': 10}
safemode: Call From mastern1.bms.com/192.168.0.109 to mastern1.bms.com:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
2020-06-04 20:13:14,639 - Retrying after 10 seconds. Reason: Execution of '/usr/hdp/current/hadoop-hdfs-namenode/bin/hdfs dfsadmin -fs hdfs://mastern1.bms.com:8020 -safemode get | grep 'Safe mode is OFF'' returned 1. safemode: Call From mastern1.bms.com/192.168.0.109 to mastern1.bms.com:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
safemode: Call From mastern1.bms.com/192.168.0.109 to mastern1.bms.com:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused


Please help me fix this issue.

 

 

 


Re: namenode failed to start after enabling kerberos

Expert Contributor

@shrikant_bm Can you confirm the Java version?

Re: namenode failed to start after enabling kerberos

Explorer

@Scharan,

Java version: 1.8.0_252

Java path in .bashrc:

export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk/

Re: namenode failed to start after enabling kerberos

Explorer

I am not getting any help on this post. Any reason?

Is there anything missing in my post?

Please help me with a solution.

 

@Scharan

Re: namenode failed to start after enabling kerberos

Expert Contributor

@shrikant_bm Can you try changing "sun.security.krb5.disableReferrals=false" to "sun.security.krb5.disableReferrals=true" in the java.security file under the JDK home on the NameNode host?
Example: /usr/java/jdk1.8.0_252/jre/lib/security/java.security
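
The change above can be sketched as a small shell helper. This is only an illustration, not an official procedure: set_disable_referrals is a hypothetical function name, and the path in the example call is an assumption based on the JAVA_HOME posted earlier in this thread (the JDK 8 layout keeps java.security under jre/lib/security). It backs up the file, sets the property to true (appending it if it is missing), and prints the resulting line.

```shell
#!/usr/bin/env bash
# Sketch only: set sun.security.krb5.disableReferrals=true in a java.security file.
# set_disable_referrals is a hypothetical helper, not part of the JDK or HDP tooling.
set_disable_referrals() {
    local file="$1"
    # Keep a backup next to the original before changing anything.
    cp -p "$file" "$file.bak" || return 1
    if grep -q '^[[:space:]]*sun\.security\.krb5\.disableReferrals=' "$file"; then
        # Property already present: rewrite its value, reading from the backup.
        sed 's/^[[:space:]]*sun\.security\.krb5\.disableReferrals=.*/sun.security.krb5.disableReferrals=true/' \
            "$file.bak" > "$file"
    else
        # Property absent: append it at the end of the file.
        printf '%s\n' 'sun.security.krb5.disableReferrals=true' >> "$file"
    fi
    # Print the resulting setting so it can be checked by eye.
    grep '^sun\.security\.krb5\.disableReferrals=' "$file"
}

# Example call (run as root; the path is an assumption, adjust it to your JDK):
# set_disable_referrals /usr/lib/jvm/java-1.8.0-openjdk/jre/lib/security/java.security
```

The NameNode JVM reads java.security at startup, so the service would need a restart from Ambari before the change can have any effect.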

 

 

Re: namenode failed to start after enabling kerberos

Explorer

@Scharan, after changing the value to true I am still facing the same issue.

 

Can you please point me to a good article or document on installing and configuring Kerberos and enabling it from Ambari?

 

Re: namenode failed to start after enabling kerberos

New Contributor

@shrikant_bm A similar issue was resolved for me after removing the 'renew_lifetime' line from /etc/krb5.conf.

The following link also provides additional information regarding this issue:

https://community.cloudera.com/t5/Community-Articles/How-to-solve-the-Message-stream-modified-41-err...
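
The krb5.conf change described above can be sketched as follows. This is a rough illustration under a few assumptions: comment_out_renew_lifetime is a hypothetical helper name, and it comments the line out instead of deleting it so the edit is easy to revert. A backup copy is kept next to the original file.

```shell
#!/usr/bin/env bash
# Sketch only: disable the renew_lifetime setting in a krb5.conf-style file.
# comment_out_renew_lifetime is a hypothetical helper, not a Kerberos or Ambari command.
comment_out_renew_lifetime() {
    local file="$1"
    # Keep a backup so the original setting can be restored.
    cp -p "$file" "$file.bak" || return 1
    # Put '#' at the start of any active renew_lifetime line; all other lines
    # are copied unchanged from the backup.
    sed 's/^[[:space:]]*renew_lifetime[[:space:]]*=/#&/' "$file.bak" > "$file"
}

# Example call (run as root on the NameNode host):
# comment_out_renew_lifetime /etc/krb5.conf
```

After the edit, restart the NameNode from Ambari and check the log again. If Ambari manages krb5.conf on your cluster, the same change would probably also be needed in the krb5.conf template in the Ambari Kerberos configuration, otherwise the file may be regenerated with the old line.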