Is it recommended to use an existing KDC on an AD domain controller, or to set up a new MIT KDC on an HDP host?

Super Collaborator
1 ACCEPTED SOLUTION

I would recommend setting up an MIT KDC for the cluster's service identities. If users in an Active Directory need access to the services on the cluster, a trust relationship between the cluster's KDC and the Active Directory can be set up to give those users access.

This configuration spreads the Kerberos-related workload across different KDCs: the service-related Kerberos workload remains local to the cluster, while the user-related Kerberos workload is sent to the Active Directory. This reduces the load on the Active Directory and, depending on the network configuration, can keep much of the network traffic local to the cluster.

8 REPLIES

Super Guru

The answer to this question depends on the customer's needs. In many cases, you would use an existing Microsoft AD implementation, as the end users are most likely already using it to access the network.

You can implement a dedicated KDC, but it will add complexity.

Super Guru

This makes a lot of sense!

Super Collaborator

Thanks Robert, this makes sense. Does it increase the complexity of the install? Also, which versions are recommended, and are there any known issues with particular versions?

@Satish Bomma

I think the installation for this is not too complex. The MIT KDC install is easy, with potentially little to do, depending on your security needs. Once the KDC is installed, an admin user needs to be created and you are done. Here is a script for CentOS that installs a KDC: install-kdcsh.txt
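For readers without access to the attached script, here is a rough sketch of what a minimal MIT KDC install on CentOS might look like. It is not the attached script. The realm `HADOOP.EXAMPLE.COM`, the `admin/admin` principal name, and the `DRY_RUN` switch are placeholders chosen for illustration, and the `systemctl` calls assume CentOS 7 or later. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch only: a minimal MIT KDC install on CentOS (assumed CentOS 7+).
# REALM and ADMIN_PRINC are placeholder values, not taken from the thread.
REALM="${REALM:-HADOOP.EXAMPLE.COM}"
ADMIN_PRINC="${ADMIN_PRINC:-admin/admin}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only; 0 = execute them (as root)

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

install_kdc() {
    # Install the KDC server, libraries, and client tools.
    run yum -y install krb5-server krb5-libs krb5-workstation
    # Before this step, /etc/krb5.conf and the KDC's kdc.conf must name $REALM.
    # Creates the Kerberos database with a stash file; prompts for a master password.
    run kdb5_util create -s -r "$REALM"
    # Create the admin user; prompts for its password.
    # The kadm5.acl file must also grant this principal admin rights.
    run kadmin.local -q "addprinc ${ADMIN_PRINC}@${REALM}"
    # Start the KDC and the admin server.
    run systemctl start krb5kdc
    run systemctl start kadmin
}

install_kdc
```

Running it as-is lists the five steps. Setting `DRY_RUN=0` and running as root would execute them, so check the realm settings in the configuration files first.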

Then the trust relationship needs to be created. This takes about 3 command-line calls on the Active Directory host and 1 on the MIT KDC host.
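As an illustration of those four calls, the sketch below prints the commands commonly documented for a one-way trust in which the cluster realm trusts the Active Directory realm. The post does not list the exact commands, so treat these as an assumption to check against your own AD and KDC documentation. The realm names, the KDC host name, and the encryption type are placeholders. The Active Directory commands run on Windows, so the script only prints them rather than executing anything.

```shell
#!/bin/sh
# Sketch only: prints the commands for a one-way trust (cluster realm trusts AD).
# All names below are placeholders, not values from the thread.
HADOOP_REALM="${HADOOP_REALM:-HADOOP.EXAMPLE.COM}"
AD_REALM="${AD_REALM:-AD.EXAMPLE.COM}"
KDC_HOST="${KDC_HOST:-kdc.hadoop.example.com}"
ENC_TYPE="${ENC_TYPE:-AES256-CTS-HMAC-SHA1-96}"   # assumed; must match on both sides

print_trust_commands() {
    cat <<EOF
# On the Active Directory domain controller (elevated Windows command prompt):
ksetup /addkdc ${HADOOP_REALM} ${KDC_HOST}
netdom trust ${HADOOP_REALM} /Domain:${AD_REALM} /add /realm /passwordt:<trust-password>
ksetup /SetEncTypeAttr ${HADOOP_REALM} ${ENC_TYPE}

# On the MIT KDC host (as root); enter the same trust password when prompted:
kadmin.local -q "addprinc krbtgt/${HADOOP_REALM}@${AD_REALM}"
EOF
}

print_trust_commands
```

The cross-realm principal `krbtgt/<cluster realm>@<AD realm>` must be created with the same password that was given to `netdom trust`, otherwise tickets issued by the Active Directory will not be accepted by the cluster's KDC.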

Finally, Ambari needs to know about the Active Directory as well as the local KDC. This is done by adding a _realm_ block to the krb5.conf template and adding the Active Directory's realm to a text box labeled "Additional Realms" while enabling Kerberos.
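To show what such a realm block looks like, here is a small helper that prints one in standard krb5.conf syntax, ready to paste under the `[realms]` section of the template. The function name `realm_block`, the realm `AD.EXAMPLE.COM`, and the host `ad-dc.ad.example.com` are made up for the example. It also assumes the admin server is the same host as the KDC unless a third argument is given.

```shell
#!/bin/sh
# Sketch only: print a [realms] entry for krb5.conf.
# $1 = realm name, $2 = KDC host, $3 = admin server host (defaults to the KDC host)
realm_block() {
    realm="$1"
    kdc="$2"
    admin="${3:-$2}"
    cat <<EOF
  ${realm} = {
    kdc = ${kdc}
    admin_server = ${admin}
  }
EOF
}

# Placeholder values for the Active Directory realm:
realm_block AD.EXAMPLE.COM ad-dc.ad.example.com
```

With these placeholder values, `AD.EXAMPLE.COM` is also what would go into the "Additional Realms" text box.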

Super Collaborator

@Robert Levas

Thank you so much.

Rising Star

@Robert Levas

Would you recommend setting up the MIT KDC on its own dedicated VM or on one of the masters? What kind of resources does it require?

@Eric Hanson

I don't have an official opinion on this. It really depends on the available resources.

If the cluster is really large, then it may be beneficial to put the KDC on its own VM; but for a small cluster (<15 hosts), that may be overkill, and the least utilized host may be sufficient for the KDC. That said, the workload could be spread out by placing one or more slave KDCs around the cluster. There is also the option to separate the kadmin and krb5kdc processes onto different hosts, though this is more for security concerns than for performance or resource concerns.

One thing to keep in mind: for Ambari server versions 2.5.0 and below, it appears that the cluster performs an abnormal number of kinit calls. This is currently being looked into. So far, it is unclear whether this is a bug, expected behavior, or something in between. The effect of this issue on a small cluster is minimal and not noticeable over a short period of time. On a large cluster (say 900 nodes), the Kerberos log files tend to get large quickly. Performance of the KDC on such a cluster, even when the KDC exists on a host with Hadoop services, does not appear to be affected. The main issue is merely log file size. However, if an issue is found and fixed, fewer kinit calls couldn't hurt. 🙂