
Some time ago I faced an interesting problem with a cluster failing to start after I replaced an MIT KDC with a FreeIPA KDC.

 
For the replacement, I installed the ipa-client package on the cluster nodes and then changed the KDC configuration in Cloudera Manager (CM): I changed the realm and KDC details, imported the Kerberos user, regenerated credentials, and so on.
 
The cluster refused to start. Besides that, CM's KDC Login monitor kept complaining about the KDC not being healthy. I could manually kinit successfully, though, and there seemed to be no KDC problems at first glance.
 
After enabling debug logging in several places, I saw socket timeouts when processes tried to connect to the KDC, and I saw that those processes were connecting over UDP rather than TCP. The UDP requests explained the problem, since UDP traffic was blocked between the cluster and the KDC.
 
What's strange, though, is that the krb5.conf created by the ipa-client install had the following configuration:
udp_preference_limit = 0
According to the MIT documentation, this should force all the communication to be over TCP, instead of UDP. From the MIT website:
 
"When sending a message to the KDC, the library will try using TCP before UDP if the size of the message is above udp_preference_limit. If the message is smaller than udp_preference_limit, then UDP will be tried before TCP. Regardless of the size, both protocols will be tried if the first attempt fails."
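
Read literally, the quoted rule can be sketched as follows. This is a minimal illustration, not MIT's code: the function name `transport_order` is hypothetical, the value 1465 is just an example limit, and treating a message exactly equal to the limit as UDP-first is an assumption, since the quote does not cover that case.

```python
def transport_order(message_size, udp_preference_limit):
    """Return the order in which transports are tried, per the quoted rule.

    ASSUMPTION: a message whose size equals the limit is treated as
    UDP-first; the quoted text only covers "above" and "smaller".
    """
    if message_size > udp_preference_limit:
        return ["tcp", "udp"]  # large message: TCP first, UDP as fallback
    return ["udp", "tcp"]      # small message: UDP first, TCP as fallback


# With a limit of 0 or 1, any realistic request is "above" the limit.
print(transport_order(200, 0))     # ['tcp', 'udp']
print(transport_order(200, 1))     # ['tcp', 'udp']
print(transport_order(200, 1465))  # ['udp', 'tcp']
```

Under this reading, 0 and 1 behave the same for any non-empty request, which is why the ipa-client value looked harmless.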
 
Even though the "library" above doesn't refer to the Java library, Java does recognize the udp_preference_limit parameter from the krb5.conf, as explained here.
 
So I expected that, with that setting, TCP would be tried first for all requests, but it was not. After three UDP attempts, the connection failed altogether without ever trying TCP.
 
I found it interesting, though, that the ipa-client installation set that value to 0. At Cloudera, we have always recommended that customers set it to 1 instead. So I went ahead and changed the entry in the krb5.conf to:
udp_preference_limit = 1
And, amazingly, everything worked after that! The debug logs no longer showed any UDP requests, the cluster came up correctly, and the CM alerts went away.
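
One rule that would reproduce what I saw is a client that only prefers TCP when the limit is strictly positive, so that 0 effectively means "no preference configured". The sketch below illustrates that idea only; the function name `prefers_tcp` and the rule itself are assumptions made for illustration, not code copied from the JDK.

```python
def prefers_tcp(message_size, udp_preference_limit):
    """Return True if TCP would be tried first.

    ASSUMPTION: a limit of 0 (or a negative value) is treated as "not set",
    so the client falls back to trying UDP first.
    """
    return udp_preference_limit > 0 and message_size > udp_preference_limit


for limit in (0, 1):
    first = "tcp" if prefers_tcp(200, limit) else "udp"
    print(f"udp_preference_limit = {limit}: first attempt over {first}")
```

With a rule like this, a value of 1 sends every real request over TCP first, while 0 leaves UDP as the first choice, which matches the difference between the two configurations described above.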
 
It's interesting how something so small can break things badly while leaving very few traces of what's going on...
The JDK behavior is coming from this.
So, in short, to be on the safe side, always set udp_preference_limit to 1 and never to 0.
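
A quick way to spot the problematic value is to scan the krb5.conf text for the setting. The following is a minimal sketch under a few assumptions: the helper names and the sample configuration (including the `EXAMPLE.COM` realm) are made up for illustration, and the parser is deliberately simple, so it ignores `include` directives and does not distinguish `[libdefaults]` from per-realm overrides.

```python
import re

# Hypothetical sample; in practice, read the text of your own krb5.conf.
SAMPLE_KRB5_CONF = """\
[libdefaults]
  default_realm = EXAMPLE.COM
  udp_preference_limit = 0
"""

# Matches "udp_preference_limit = <integer>"; commented-out lines do not match.
_SETTING = re.compile(r"^\s*udp_preference_limit\s*=\s*(-?\d+)\s*$")


def find_udp_preference_limit(conf_text):
    """Return the last udp_preference_limit value in the text, or None."""
    value = None
    for line in conf_text.splitlines():
        match = _SETTING.match(line)
        if match:
            value = int(match.group(1))
    return value


def describe(conf_text):
    """Return a one-line summary of the setting found in conf_text."""
    value = find_udp_preference_limit(conf_text)
    if value is None:
        return "udp_preference_limit is not set"
    if value == 0:
        return "udp_preference_limit = 0: consider changing it to 1"
    return f"udp_preference_limit = {value}"


print(describe(SAMPLE_KRB5_CONF))
```

To check a real node, pass the contents of its krb5.conf (commonly /etc/krb5.conf) to `describe` instead of the sample string.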