Server has invalid Kerberos principal

After Kerberizing the cluster (HDP 2.3.2 on SLES 11.4), many services no longer start:

Nimbus, Storm Supervisor, HBase Master, Phoenix, and the YARN NodeManagers on all nodes other than the ResourceManager host.

Specifically, the YARN log contains the following:

org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.io.IOException: Failed on local exception: java.io.IOException: java.lang.IllegalArgumentException: Server has invalid Kerberos principal: rm/<res_mgr_host>@hdp23cluster; Host Details : local host is: "<nod_mgr_host>/<nod_mgr_ip>"; destination host is: "<res_mgr_host>":8025;

The YARN ResourceManager runs on <res_mgr_host>; an additional NodeManager runs on <nod_mgr_host>.

I installed a standard Kerberos server using zypper and successfully enabled Kerberos in Ambari, leaving all settings at their default values.

A proxy is configured on all nodes. In addition, the no_proxy environment variable lists the hosts for which the proxy should be bypassed: all cluster nodes plus some other hosts.

What could be wrong?

14 REPLIES

I would only expect a keytab file to work on the particular host it was distributed to. This is because each service principal has the hostname where the service runs embedded in its name, so copying keytab files between hosts is not recommended.

That said, you might want to make sure that each host's name is reported identically by the different mechanisms for obtaining it.

For example, hostname -f should return the fully qualified domain name (FQDN) of the host, and that FQDN should match the one used to register the host with Ambari.
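
A rough way to compare those mechanisms on a node is sketched below. It is only a sketch, not an official Ambari or Kerberos tool: the function names hostname_report and is_consistent are made up for illustration, and the registered FQDN has to be supplied by hand from whatever Ambari shows for the host.

```python
import socket


def hostname_report(gethostname=socket.gethostname,
                    getfqdn=socket.getfqdn,
                    gethostbyname=socket.gethostbyname,
                    gethostbyaddr=socket.gethostbyaddr):
    """Collect the host's name as seen through several lookup mechanisms.

    The resolver functions are parameters only so that the check can be
    exercised without touching the real resolver.
    """
    fqdn = getfqdn()
    report = {
        "hostname": gethostname(),  # name as configured on the machine
        "fqdn": fqdn,               # what the resolver considers the FQDN
        "forward_ip": None,         # FQDN -> IP (forward lookup)
        "reverse_name": None,       # IP -> name (reverse lookup)
    }
    try:
        ip = gethostbyname(fqdn)
        report["forward_ip"] = ip
        report["reverse_name"] = gethostbyaddr(ip)[0]
    except OSError as exc:          # gaierror and herror are OSError subclasses
        report["error"] = str(exc)
    return report


def is_consistent(report, registered_fqdn):
    """True if the FQDN, the reverse lookup and the registered name all agree."""
    fqdn = report["fqdn"].lower()
    reverse = report["reverse_name"]
    return (fqdn == registered_fqdn.lower()
            and reverse is not None
            and reverse.lower() == fqdn)


# Usage on a node (the FQDN below is a placeholder, not a real host):
#   report = hostname_report()
#   print(report, is_consistent(report, "node1.example.com"))
```

If the reverse lookup returns a different name, or fails, that is a hint that the forward and reverse records for the host do not line up.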

This appears to be an FQDN issue. Does your name resolution go through a DNS server or the hosts file? If it is the hosts file, make sure every node has an entry for each cluster host, with the assigned IP address followed by the FQDN.
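
As an illustration of that check, here is a small sketch that scans hosts-file text and reports cluster FQDNs that are missing or that appear only as an alias after a short name. It assumes the usual "IP address, canonical name, aliases" line layout; the function names and the example.com hosts are invented for the example.

```python
import ipaddress


def parse_hosts(text):
    """Return (ip, canonical_name, aliases) tuples from hosts-file text."""
    entries = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and whitespace
        fields = line.split()
        if len(fields) < 2:
            continue                          # blank or incomplete line
        try:
            ipaddress.ip_address(fields[0])   # first field must be an IP
        except ValueError:
            continue
        entries.append((fields[0], fields[1], fields[2:]))
    return entries


def check_fqdns(text, expected_fqdns):
    """Map each problematic FQDN to a short description of the problem."""
    entries = parse_hosts(text)
    problems = {}
    for fqdn in expected_fqdns:
        wanted = fqdn.lower()
        if any(name.lower() == wanted for _, name, _ in entries):
            continue                          # FQDN is the first name: fine
        aliased = any(wanted in (a.lower() for a in aliases)
                      for _, _, aliases in entries)
        problems[fqdn] = "listed only as an alias" if aliased else "missing"
    return problems


# Usage on a node (host names are placeholders):
#   with open("/etc/hosts") as fh:
#       print(check_fqdns(fh.read(), ["node1.example.com", "node2.example.com"]))
```

The alias case is flagged because, on typical Linux resolvers, the first name on a line is treated as the canonical one, so a line such as "10.0.0.6 node2 node2.example.com" can make the host report its short name instead of the FQDN.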

@Robert Levas @Pranay Vyas

Name resolution works over a DNS server, but Kerberos seems to ignore it.

Adding the IP addresses and hostnames to the /etc/hosts file seems to help, so thank you for the tip!

However, this doesn't solve the problem; it produces a different error message instead:

org.apache.hadoop.yarn.exceptions.YarnRuntimeException: org.apache.hadoop.security.authorize.AuthorizationException: User nm/msas6502i.msg.de@HDP23CLUSTER (auth:KERBEROS) is not authorized for protocol interface org.apache.hadoop.yarn.server.api.ResourceTrackerPB, expected client Kerberos principal is nm/10.100.233.13@HDP23CLUSTER
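
In that message the NodeManager presents nm/msas6502i.msg.de, while the expected principal carries the IP address 10.100.233.13 where a hostname would normally be. One plausible reading (an assumption, not something the log states) is that the ResourceManager could not map the client's address back to a hostname. The sketch below only splits the two principals from the message and flags an IP address in the host part; the helper names are made up for illustration.

```python
import ipaddress


def split_principal(principal):
    """Split 'primary/instance@REALM' into (primary, instance, realm)."""
    name, _, realm = principal.partition("@")
    primary, _, instance = name.partition("/")
    return primary, instance, realm


def instance_is_ip(principal):
    """True if the host part of the principal is a literal IP address."""
    _, instance, _ = split_principal(principal)
    try:
        ipaddress.ip_address(instance)
    except ValueError:
        return False
    return True


# The two principals from the error message above
actual = "nm/msas6502i.msg.de@HDP23CLUSTER"
expected = "nm/10.100.233.13@HDP23CLUSTER"

for label, principal in (("actual", actual), ("expected", expected)):
    note = "host part is an IP address" if instance_is_ip(principal) else "host part is a name"
    print(label, split_principal(principal), note)
```

An IP address in the expected principal is worth following up with a reverse lookup of that address from the ResourceManager host.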

I had to un-Kerberize and re-Kerberize the cluster; now it works!

@Robert Levas @Pranay Vyas It was definitely a name resolution problem: in this setup Kerberos did not pick up the hostnames from DNS and only resolved them once they were listed in /etc/hosts.

Many thanks!