Created on 12-25-201610:49 AM - edited 09-16-202201:37 AM
SYMPTOM: All the services in the cluster are down and restarting the services fails with the following error:
2016-11-17 21:42:18,235 ERROR namenode.NameNode (NameNode.java:main(1712)) - Failed to start namenode.
java.io.IOException: Login failure for nn/lxxx.examplet.ex.com@EXAMPLE.AD.EX.COM from keytab /etc/security/keytabs/nn.service.keytab: javax.security.auth.login.LoginException: Client not found in Kerberos database (6)
...
Caused by: KrbException: Client not found in Kerberos database (6)
...
Caused by: KrbException: Identifier doesn't match expected value (906)
Regeneration of Keytabs using Ambari too failed as follows:
17 Nov 2016 23:58:59,136 WARN [Server Action Executor Worker 12702] CreatePrincipalsServerAction:233 - Principal, HTTP/xxx.examplet.ex.com@EXAMPLE.AD.EX.COM, does not exist, creating new principal
17 Nov 2016 23:58:59,151 ERROR [Server Action Executor Worker 12702] CreatePrincipalsServerAction:284 - Failed to create or update principal, HTTP/xxx.examplet.ex.com@EXAMPLE.AD.EX.COM - Can not create principal : HTTP/xxx.examplet.ex.com@EXAMPLE.AD.EX.COM
org.apache.ambari.server.serveraction.kerberos.KerberosOperationException: Can not create principal : HTTP/xxx.examplet.ex.com@EXAMPLE.AD.EX.COM
Caused by: javax.naming.NameAlreadyBoundException: [LDAP: error code 68 - 00002071: UpdErr: DSID-0305038D, problem 6005 (ENTRY_EXISTS), data 0
]; remaining name '"cn=HTTP/lxxx.examplet.ex.com,OU=Hadoop,OU=EXAMPLE_Users,DC=examplet,DC=ad,DC=ex,DC=com"'
ROOT CAUSE: Wrong entries in all service accounts(VPN) in AD. Characters '/' was replaced with '_' by a wrong script.
RESOLUTION: Fix the issue in the AD service accounts. In the above case, all '_' was replaced with '/' in the service accounts in AD.