Created on 10-18-2015 06:04 AM - edited 09-16-2022 02:44 AM
I am working on Kerberos with AD (no local KDC) integration for multiple Hadoop clusters (Prod and DR). The goal is to have all user and service principals reside in the corporate AD. Would I need to create two separate sets of user and service IDs, one for each cluster? The idea is for a single user ID to be able to log in to either the Prod or the DR cluster, depending on which one is active.
When the Prod cluster is set up with Kerberos via Ambari, Ambari generates all the necessary principals and keytabs. What happens when the second cluster (DR) needs to be configured for Kerberos? Does Ambari know that the principals already exist, or will it try to regenerate them?
Created 10-18-2015 03:11 PM
Assuming this is for Ambari 2.x: when Ambari manages Kerberos identities, each identity is expected to be unique across all clusters that share a KDC or AD. By default, Ambari ensures this by appending the cluster name (which should itself be unique across clusters sharing the same KDC or AD) to the headless principals it manages. For example, the HDFS Kerberos identity typically has a principal name of hdfs-${CLUSTER_NAME}@${REALM}. So if the realm is AD.EXAMPLE.COM and two clusters share the AD, Cluster1 and Cluster2, then Ambari will create the following accounts in the AD: hdfs-Cluster1@AD.EXAMPLE.COM and hdfs-Cluster2@AD.EXAMPLE.COM.
This is needed so that Ambari does not change the password for one cluster's identities while performing Kerberos-related tasks for another cluster.
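To make the naming scheme concrete, here is a minimal sketch that expands the hdfs-${CLUSTER_NAME}@${REALM} pattern for the two example clusters. It only illustrates the string substitution; it is not Ambari code, and the helper name headless_principal is hypothetical.

```python
from string import Template

# Pattern quoted in the answer above. Ambari performs its own substitution;
# this only mimics the resulting names for illustration.
HEADLESS_PATTERN = Template("hdfs-${CLUSTER_NAME}@${REALM}")


def headless_principal(cluster_name, realm):
    """Hypothetical helper: expand the headless principal pattern."""
    return HEADLESS_PATTERN.substitute(CLUSTER_NAME=cluster_name, REALM=realm)


# Two clusters sharing one AD realm, as in the example above.
principals = [
    headless_principal(cluster, "AD.EXAMPLE.COM")
    for cluster in ("Cluster1", "Cluster2")
]

for principal in principals:
    print(principal)
```

Because the cluster name is part of each principal, the two names stay distinct as long as the cluster names differ.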
Ideally each cluster points to a different container (or OU), but that is only for convenience; it makes no difference to Ambari.
When it comes to service identities, Ambari relies on the uniqueness of the hostnames to create the service principals. This is because service principals have a hostname associated with them. For example, nn/h1c1.example.com@AD.EXAMPLE.COM.
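As a rough way to check this before enabling Kerberos on a second cluster, the sketch below builds service principals in the service/host@REALM form shown above and reports any name that would be claimed by two clusters. The function names and the second hostname (h1c2.example.com) are assumptions made for illustration, not part of Ambari.

```python
def service_principal(service, host, realm):
    """Build a principal in the service/host@REALM form."""
    return f"{service}/{host}@{realm}"


def find_collisions(clusters, service, realm):
    """Return (principal, first_cluster, second_cluster) tuples for any
    principal that two different clusters would both try to own."""
    owner = {}
    collisions = []
    for cluster, hosts in clusters.items():
        for host in hosts:
            principal = service_principal(service, host, realm)
            if principal in owner and owner[principal] != cluster:
                collisions.append((principal, owner[principal], cluster))
            owner.setdefault(principal, cluster)
    return collisions


# Hypothetical host inventory: hostnames are unique per cluster.
clusters = {
    "Cluster1": ["h1c1.example.com"],
    "Cluster2": ["h1c2.example.com"],
}
print(find_collisions(clusters, "nn", "AD.EXAMPLE.COM"))
```

With unique hostnames the list is empty; if the same hostname were listed under both clusters, the shared principal would be reported.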
For non-managed (user) identities, none of this makes a difference. Since Ambari does not manage these accounts, it will not change them in any way, so they are safe to manage manually and to use across clusters.
Note: Administrators should leave the Ambari-managed AD accounts alone, since changing them may prevent Ambari or the related Hadoop services from authenticating properly.
Created 11-02-2016 09:27 AM
Hi @dbaev, I would like to set up the same scenario: two clusters using the same AD, both with Kerberos. How was your experience? Did you run into any problems?
Thanks,
Created 03-22-2018 01:48 PM
Hi!
In my case, both clusters will be configured to use a Dell EMC Isilon.
The installation guide has a configuration step to remove the "-${CLUSTER_NAME}" suffix.
How should I proceed in this case?
Will it be possible to have both clusters in the same AD domain (realm)?
From the installation guide "Isilon-OneFS-With-Hadoop-and-Hortonworks-for-Kerberos-Installation-Guide":
"Click the General tab and configure the Apache Ambari user principals as shown in the next table. Remove -${cluster-name} from the default value and change to a value as shown in the Required value column so that it matches the service account names (users) that you created during the initial configuration of your Isilon OneFS cluster for use with Ambari and Hortonworks."