Created 05-10-2016 09:22 AM
After kerberizing the cluster (HDP 2.3.2 on SLES 11.4), many services no longer start:
Nimbus, Storm Supervisor, HBase Master, Phoenix, and the YARN node managers on the nodes other than the Resource Manager.
In detail, the YARN log contains the following:
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.io.IOException: Failed on local exception: java.io.IOException: java.lang.IllegalArgumentException: Server has invalid Kerberos principal: rm/<res_mgr_host>@hdp23cluster; Host Details : local host is: "<nod_mgr_host>/<nod_mgr_ip>"; destination host is: "<res_mgr_host>":8025;
The YARN Resource Manager runs on <res_mgr_host>; an additional node manager runs on <nod_mgr_host>.
I installed a standard Kerberos server using zypper and successfully configured Kerberos in Ambari, leaving all default values.
A proxy is configured on all nodes. In addition, the no_proxy environment variable lists the hosts for which the proxy should be bypassed: all cluster nodes plus some other hosts.
What could be wrong?
Created 05-10-2016 08:40 PM
This appears to be an FQDN issue. Is your name resolution done through a DNS server or through the hosts file? If it is the hosts file, make sure every node has an entry with its assigned IP address followed by its FQDN.
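As a rough way to check this, the sketch below scans a hosts file and flags any non-loopback entry whose first name is not fully qualified. It assumes the usual hosts layout (IP address first, then the canonical name, then aliases); the function name check_hosts_file is made up for illustration.

```shell
# check_hosts_file FILE (hypothetical helper)
# Prints every non-loopback entry whose first name contains no dot and
# returns 1 if any such entry exists, 0 otherwise.
check_hosts_file() {
    awk '
        BEGIN { bad = 0 }
        /^[[:space:]]*(#|$)/ { next }
        $1 ~ /^(127\.|::1$)/ { next }
        $2 !~ /\./ { print "no FQDN as first name: " $0; bad = 1 }
        END { exit bad }
    ' "$1"
}
```

For example, run check_hosts_file /etc/hosts on each node. Comparing the output of hostname -f across the nodes is a useful complement.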
Created on 05-10-2016 09:29 AM - edited 08-19-2019 01:23 AM
Can you log in to Ambari, click Kerberos -> Advanced, and check whether the principals for the respective services are present and were created properly?
Created 05-10-2016 09:32 AM
Were there any issues while kerberizing the cluster? Did it finish without any errors?
Created 05-10-2016 01:00 PM
Unless you changed this when editing your log entry for the post, your realm is incorrect. You have "hdp23cluster" as your realm when it should be in all upper case characters - "HDP23CLUSTER".
To change this, your best bet is to disable Kerberos and then re-enable Kerberos with the correct realm.
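To see which realm a node is actually configured with, something like the following could be used. It is a minimal sketch: the helper names default_realm and realm_is_upper are invented here, and it assumes the realm is set through a default_realm line in /etc/krb5.conf.

```shell
# default_realm FILE (hypothetical helper)
# Prints the value of the default_realm setting from a krb5.conf-style file.
default_realm() {
    sed -n 's/^[[:space:]]*default_realm[[:space:]]*=[[:space:]]*\([^[:space:]]*\).*$/\1/p' "$1"
}

# realm_is_upper NAME (hypothetical helper)
# Takes a realm or a full principal; succeeds if the part after the last "@"
# (or the whole argument, if there is no "@") contains no lower-case letters.
realm_is_upper() {
    realm=${1##*@}
    case $realm in
        *[[:lower:]]*) return 1 ;;
        *) return 0 ;;
    esac
}
```

A possible use on each node: realm_is_upper "$(default_realm /etc/krb5.conf)" || echo "realm is not all uppercase"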
Created 05-10-2016 01:43 PM
@Sagar Shimpi: the user exists.
Created 05-10-2016 01:45 PM
@Robert Levas: The realm is also in lower case in Kerberos. Should I really enter it in uppercase in Ambari?
Created 05-10-2016 02:01 PM
Technically, if the realm matches in the KDC, the /etc/krb5.conf file, and Ambari, all should work. But I have seen that the MIT Kerberos libraries tend to assume the realm is all uppercase - or maybe it is the internal Hadoop Kerberos logic.
You can check the MIT library case by attempting a manual kinit and seeing whether it works:
kinit -kt /etc/security/keytabs/rm.service.keytab rm/<res_mgr_host>@hdp23cluster
In any case, I would disable Kerberos in Ambari, rebuild the KDC using the uppercase form of the realm, and then re-enable Kerberos. If it doesn't work after this, we can at least rule out the case-sensitivity issue.
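Independently of the realm question, it can help to confirm that the keytab really contains the principal being tested. The sketch below filters the output of klist -kt, assuming the principal is the last field on each entry line, as MIT klist prints it; the helper name keytab_has_principal is hypothetical.

```shell
# keytab_has_principal PRINCIPAL (hypothetical helper)
# Reads `klist -kt <keytab>` output on stdin and succeeds if the last field
# of any line is exactly PRINCIPAL.
keytab_has_principal() {
    awk -v want="$1" '$NF == want { found = 1 } END { exit found ? 0 : 1 }'
}

# Usage on a cluster node (keytab path and principal taken from the post above):
# klist -kt /etc/security/keytabs/rm.service.keytab | keytab_has_principal "rm/<res_mgr_host>@hdp23cluster"
```

If the function fails, the principal name or realm stored in the keytab differs from the one passed to kinit.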
Created 05-10-2016 04:09 PM
Thank you for the tip. After recreating and reconfiguring everything with an uppercase realm, it still doesn't work.
However, I have noticed that the keytab rm.service.keytab is present on the RM host, but not on the other hosts.
Should the keytab be present on every host? If yes, then the automatic deployment of the keytabs isn't working properly.
Keytabs on the non-RM node:
dn.service.keytab hbase.headless.keytab hdfs.headless.keytab knox.service.keytab nfs.service.keytab nm.service.keytab nn.service.keytab smokeuser.headless.keytab spark.headless.keytab spnego.service.keytab zk.service.keytab
Keytabs on the RM node:
dn.service.keytab hbase.headless.keytab hdfs.headless.keytab hive.service.keytab jhs.service.keytab nfs.service.keytab nm.service.keytab nn.service.keytab oozie.service.keytab rm.service.keytab sbetp.headless.keytab smokeuser.headless.keytab spark.headless.keytab spnego.service.keytab yarn.service.keytab zk.service.keytab
Created 05-10-2016 04:54 PM
The keytabs are only distributed to the hosts on which they are needed, so I do not expect all keytab files to be distributed to all hosts.
Created 05-10-2016 04:53 PM
The command
kinit -kt /etc/security/keytabs/rm.service.keytab rm/<res_mgr_host>@hdp23cluster
works only on the RM node, maybe because of the missing keytab.
After copying rm.service.keytab to all nodes, the command works in the console, but the node manager still fails with the same error: "Server has invalid Kerberos principal".