Note: Credit for the key piece of information to solve this problem goes to Phil D’Amore.

A customer had trouble writing a Java application that used the Hive client libraries to talk to two secure Hadoop clusters residing in different Kerberos realms. A client connecting to a single secure Hadoop cluster can hit the same problem if that cluster is not in the Kerberos “default_realm” specified in the client host’s krb5.conf file. The problem is not specific to Hive; it can affect any Hadoop ecosystem client.

In order to communicate with two different secure Hadoop clusters, in different Kerberos realms, the client application did the following things correctly:

  • It harvested the needed configuration files (in this case, core-site.xml, hdfs-site.xml, and hive-site.xml) from each target cluster, and used the appropriate configuration when communicating with each respective cluster.
  • Its application user id had two Kerberos principals, one registered and authenticated with each of the two KDCs, and used the appropriate principal when authenticating to each respective cluster.
  • On the client host, it had a krb5.conf file that correctly specified Kerberos kdc and admin_server values for each of the two target realms in the [realms] section, and set one of the realms as the “default_realm” in the [libdefaults] section. (It could also have set a third realm as the default_realm. That would simply mean that both target clusters were in non-default realms, which is also fine.)
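As an illustration of the krb5.conf setup described in the last item, here is a minimal sketch. The realm names (CORP-A.EXAMPLE, CORP-B.EXAMPLE) and KDC host names are hypothetical placeholders, not the customer's actual values. Note that the realm names deliberately do not match the DNS domains of the hosts, which is the situation this article is about.

```ini
# Hypothetical example: realm and host names are placeholders.
[libdefaults]
    default_realm = CORP-A.EXAMPLE

[realms]
    CORP-A.EXAMPLE = {
        kdc = kdc.cluster-a.example.com
        admin_server = kdc.cluster-a.example.com
    }
    CORP-B.EXAMPLE = {
        kdc = kdc.cluster-b.example.com
        admin_server = kdc.cluster-b.example.com
    }
```

A file like this is enough for each principal to obtain its own ticket-granting ticket, because the principal name carries its realm explicitly.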

However, when the customer ran the application, they hit a puzzling problem: the application authenticated successfully to the target cluster in the default realm, but failed against the target cluster in the non-default realm. After the failure, logs on the default realm’s KDC showed that the authentication request intended for the non-default realm had been sent there instead.

They knew they had not made a coding error, because changing the default_realm to the other target cluster’s realm reversed the situation. Depending on the setting of default_realm in the krb5.conf file, they could talk to either cluster, but not both at once.

The problem was fixed by adding a [domain_realm] section to the krb5.conf file. It turns out that the Thrift libraries underlying the client use APIs that pass only the target server, not the target “realm”. The Kerberos libraries are therefore responsible for translating the target server’s domain into the target realm. If the domain and the realm have identical string values apart from upper/lower case, which is common but not required, the libraries use that realm. Failing that, they use the default realm. They do not infer the realm from the domain of the KDC servers.

In this case the domain and realm were different, so the authentication request for the non-default realm was being sent to the default realm’s KDC. A [domain_realm] section in the krb5.conf file allows arbitrary mappings from target domains to target realms; once it was added, Kerberos could translate the target domain to the correct target realm. See http://web.mit.edu/kerberos/krb5-1.12/doc/admin/conf_files/krb5_conf.html#domain-realm for details of the krb5.conf file sections and contents.
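A [domain_realm] section of the kind described might look like the following sketch. The domains (cluster-a.example.com, cluster-b.example.com) and realms (CORP-A.EXAMPLE, CORP-B.EXAMPLE) are hypothetical placeholders chosen so that domain and realm differ; substitute the DNS domains of your cluster hosts and the realms they actually belong to.

```ini
# Hypothetical example: domain and realm names are placeholders.
[domain_realm]
    # A leading dot maps every host within the domain.
    .cluster-a.example.com = CORP-A.EXAMPLE
    .cluster-b.example.com = CORP-B.EXAMPLE
    # An entry without the dot maps a host with exactly that name.
    cluster-a.example.com = CORP-A.EXAMPLE
    cluster-b.example.com = CORP-B.EXAMPLE
```

Mapping both clusters explicitly, rather than only the one in the non-default realm, means the configuration keeps working if default_realm is later changed.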
