Created on 02-22-2016 10:34 PM - edited on 02-12-2020 04:23 AM by VidyaSargur
Kerberos cross realm trust for distcp
This article demonstrates how to set up cross-realm trust for distcp between two secure HDP clusters, each with its own Kerberos realm (KDC).
Prerequisites
Assume the first (DEV) HDP cluster uses the realm HDPDEV.DEV.COM
Assume the second (QA) HDP cluster uses the realm HDPQA.QA.COM
Step 1 :
To set up cross-realm trust between HDPDEV.DEV.COM and HDPQA.QA.COM so that, for example, a client in realm HDPDEV.DEV.COM can access a service in realm HDPQA.QA.COM, both realms must share a key for the principal krbtgt/HDPQA.QA.COM@HDPDEV.DEV.COM, and both keys must have the same key version number (kvno).
Cross-realm trust is unidirectional by default. For clients in HDPQA.QA.COM to also access services in HDPDEV.DEV.COM, both realms must share a key for the principal krbtgt/HDPDEV.DEV.COM@HDPQA.QA.COM.
Add both krbtgt principals on both clusters. The principal names contain no spaces, and the same two principals are created on each KDC:
#HDP DEV cluster
kadmin.local : addprinc krbtgt/HDPQA.QA.COM@HDPDEV.DEV.COM
kadmin.local : addprinc krbtgt/HDPDEV.DEV.COM@HDPQA.QA.COM
#HDP QA cluster
kadmin.local : addprinc krbtgt/HDPQA.QA.COM@HDPDEV.DEV.COM
kadmin.local : addprinc krbtgt/HDPDEV.DEV.COM@HDPQA.QA.COM
Note: On both clusters, verify that both entries have matching kvno and encryption types using kadmin.local : getprinc <principal_name>. Enter the same password for a given principal on both KDCs so that the derived keys match.
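The naming convention in Step 1 is easy to get backwards, so here is a minimal sketch of it. The principal that lets clients of one realm reach services of another is krbtgt/<service realm>@<client realm>. The function names cross_realm_principal and addprinc_commands are hypothetical helpers for illustration only; the script just prints kadmin.local command lines and does not contact any KDC.

```python
from typing import List


def cross_realm_principal(client_realm: str, service_realm: str) -> str:
    """Principal that lets clients in client_realm reach services in service_realm."""
    return f"krbtgt/{service_realm}@{client_realm}"


def addprinc_commands(realm_a: str, realm_b: str) -> List[str]:
    """Command lines for a two-way trust; the same list is run on both KDCs."""
    principals = [
        cross_realm_principal(realm_a, realm_b),  # clients in A -> services in B
        cross_realm_principal(realm_b, realm_a),  # clients in B -> services in A
    ]
    # kadmin.local -q runs a single query; addprinc will prompt for a password.
    return [f'kadmin.local -q "addprinc {p}"' for p in principals]


if __name__ == "__main__":
    for command in addprinc_commands("HDPDEV.DEV.COM", "HDPQA.QA.COM"):
        print(command)
```

Running the printed commands on both KDCs, with the same password for each principal, gives both realms the shared keys that Step 1 requires.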
Step 2:
The next step is to set the hadoop.security.auth_to_local parameter in both clusters. This parameter maps a Kerberos principal to a local user name. One issue here is that the SASL RPC client requires the remote server's Kerberos principal to match the server principal in its own configuration. Therefore, the NameNodes in the source and destination clusters must use the same service name in their principals. For example, if the Kerberos principal name of the NameNode in the source cluster is nn/host1@HDPDEV.DEV.COM, the Kerberos principal name of the NameNode in the destination cluster must be nn/host2@HDPQA.QA.COM rather than, for example, nn2/host2@HDPQA.QA.COM.
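In the DEV cluster, add the following rules for the QA realm, ahead of the cluster's existing auth_to_local rules:
<property>
  <name>hadoop.security.auth_to_local</name>
  <value>
    RULE:[2:$1@$0](nn@.*HDPQA.QA.COM)s/.*/hdfs/
    RULE:[2:$1@$0](rm@.*HDPQA.QA.COM)s/.*/yarn/
    RULE:[1:$1@$0](.*@HDPQA.QA.COM)s/@.*//
    RULE:[2:$1@$0](.*@HDPQA.QA.COM)s/@.*//
  </value>
</property>
In the QA cluster, add the corresponding rules for the DEV realm:
<property>
  <name>hadoop.security.auth_to_local</name>
  <value>
    RULE:[2:$1@$0](nn@.*HDPDEV.DEV.COM)s/.*/hdfs/
    RULE:[2:$1@$0](rm@.*HDPDEV.DEV.COM)s/.*/yarn/
    RULE:[1:$1@$0](.*@HDPDEV.DEV.COM)s/@.*//
    RULE:[2:$1@$0](.*@HDPDEV.DEV.COM)s/@.*//
  </value>
</property>
Each rule needs the closing parenthesis after the match pattern. The nn and rm rules use s/.*/hdfs/ and s/.*/yarn/ so that the whole string is replaced; a substitution of s/@.*/hdfs/ would only replace the realm suffix and map nn to nnhdfs.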
To test the mapping, use org.apache.hadoop.security.HadoopKerberosName.
For example,
[root@localhost]$ hadoop org.apache.hadoop.security.HadoopKerberosName nn/localhost@HDPDEV.DEV.COM
Name: nn/localhost@HDPDEV.DEV.COM to hdfs
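To reason about a rule before deploying it, the evaluation can be approximated offline. The following is a simplified sketch, not Hadoop's implementation: it handles only rules of the form RULE:[n:format](match)s/pattern/replacement/, ignores DEFAULT and other options, and uses Python regular expressions in place of Java's. The function name apply_rules and the host name host2 are made up for illustration. It shows why a substitution of s/@.*/hdfs/ yields nnhdfs while s/.*/hdfs/ yields hdfs.

```python
import re
from typing import List, Optional

RULE_RE = re.compile(
    r"RULE:\[(\d+):([^\]]*)\]"      # [n:format]
    r"(?:\(([^)]*)\))?"             # optional (match regex)
    r"(?:s/([^/]*)/([^/]*)/(g)?)?"  # optional sed-style substitution
)


def apply_rules(principal: str, rules: List[str]) -> Optional[str]:
    """Return the short name from the first rule that applies, else None."""
    name, _, realm = principal.partition("@")
    components = name.split("/")
    for rule in rules:
        parsed = RULE_RE.fullmatch(rule.strip())
        if parsed is None:
            # For example, a rule whose match pattern lacks its closing ")".
            raise ValueError(f"unparseable rule: {rule!r}")
        count, fmt, match, pattern, repl, global_flag = parsed.groups()
        if int(count) != len(components):
            continue  # rule targets principals with a different component count
        # Expand $0 (realm) and $1..$n (name components) in the format string.
        base = re.sub(
            r"\$(\d)",
            lambda ref: realm if ref.group(1) == "0" else components[int(ref.group(1)) - 1],
            fmt,
        )
        if match is not None and re.fullmatch(match, base) is None:
            continue  # the built string must match the pattern in full
        if pattern is not None:
            # Without the trailing g flag only the first occurrence is replaced.
            base = re.sub(pattern, repl, base, count=0 if global_flag else 1)
        return base
    return None


if __name__ == "__main__":
    suffix_only = ["RULE:[2:$1@$0](nn@.*HDPQA.QA.COM)s/@.*/hdfs/"]
    whole_string = ["RULE:[2:$1@$0](nn@.*HDPQA.QA.COM)s/.*/hdfs/"]
    print(apply_rules("nn/host2@HDPQA.QA.COM", suffix_only))   # nnhdfs
    print(apply_rules("nn/host2@HDPQA.QA.COM", whole_string))  # hdfs
```

The sketch is only a reasoning aid; the HadoopKerberosName command above remains the authoritative check on a real cluster.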
Step 3:
Configure the trust path between the realms. There are two ways to do it. One is to rely on a shared hierarchy of realm names; this is the default and the simpler method. The other is to define the capaths section of the krb5.conf file explicitly; this takes more work but is more flexible.
Configure paths in krb5.conf :
Configure the capaths section of /etc/krb5.conf so that clients holding credentials for one realm can look up which realm is next in the chain that eventually leads to the realm of the server they need to authenticate to.
Edit the /etc/krb5.conf files on both clusters (all nodes) to map the domain to the realm.
For example,
In the DEV cluster:
[capaths]
    HDPDEV.DEV.COM = {
        HDPQA.QA.COM = .
    }
In the QA cluster:
[capaths]
    HDPQA.QA.COM = {
        HDPDEV.DEV.COM = .
    }
The value “.” is used if there are no intermediate realms.
Step 4 :
Set the dfs.namenode.kerberos.principal.pattern parameter in hdfs-site.xml to *. This is a client-side pattern that controls which realms' NameNode principals the client accepts when authenticating.
If this parameter is not set, requests to the other cluster fail with an error such as:
java.io.IOException: Failed on local exception: java.io.IOException: java.lang.IllegalArgumentException: Server has invalid Kerberos principal: nn/hdm1.qa.com@HDP.DEV.COM; Host Details : local host is: "sdw1.dev.com/10.181.22.130"; destination host is: "hdm1.qa.com":8020;
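The value * accepts any NameNode principal. A narrower pattern may be preferable, and its effect can be previewed with the rough sketch below. It assumes the value is treated as a glob, which is how the Hadoop RPC client appears to compile it, and uses Python's fnmatch as an approximation of that matching. The helper name principal_allowed is hypothetical, and the principal uses the host name from the error above with this article's QA realm.

```python
from fnmatch import fnmatchcase


def principal_allowed(server_principal: str, pattern: str) -> bool:
    """Approximate the client-side check of a server principal against a glob."""
    return fnmatchcase(server_principal, pattern)


if __name__ == "__main__":
    remote_nn = "nn/hdm1.qa.com@HDPQA.QA.COM"
    for candidate in ("*", "nn/*@HDPQA.QA.COM", "nn/*@HDPDEV.DEV.COM"):
        print(candidate, principal_allowed(remote_nn, candidate))
```

Under these assumptions, a pattern such as nn/*@HDPQA.QA.COM would accept the QA NameNode while rejecting principals from unrelated realms; verify any narrower pattern on a test cluster before relying on it.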
Step 5 :
Verify that the trust is set up by running hdfs commands from the DEV cluster against the QA cluster, and vice versa.
Example:
On the DEV cluster, kinit userA@HDPDEV.DEV.COM and then issue hdfs commands:
hdfs dfs -ls hdfs://<NameNode_FQDN_forQACluster>:8020/tmp
hdfs dfs -put /tmp/test.txt hdfs://<NameNode_FQDN_forQACluster>:8020/tmp
Do a similar test on QA cluster.
Step 6 :
Run distcp to copy a file from the DEV cluster to the QA cluster:
hadoop distcp hdfs://<NameNode_FQDN_forDEVCluster>:8020/tmp/test.txt hdfs://<NameNode_FQDN_forQACluster>:8020/tmp/
Created on 08-09-2016 11:40 AM
The hadoop.security.auth_to_local rules are missing a closing parenthesis, ")".
Please update them.
Also, if I run the command below from the QA cluster, I get the wrong mapping:
hadoop org.apache.hadoop.security.HadoopKerberosName nn/localhost@HDPDEV.DEV.COM
Name: nn/localhost@HDPDEV.DEV.COM to nnhdfs
Created on 02-14-2017 03:04 AM
One question please:
-----------------------
#HDP QA cluster
Kadmin.local : addprinc krbtgt/ HDPDQA.QA.COM@ HDPDEV.QA.COM
-----------------------
Is the above needed, and is it correct?
Created on 02-17-2020 05:38 AM
What role does krbtgt/ HDPDQA.QA.COM@ HDPDEV.DEV.COM play during the authentication process?
Created on 08-23-2020 05:50 AM
After following the exact steps in this article, I got the exception below in the Ranger KMS log, and the distcp job failed.
org.apache.hadoop.security.authentication.util.KerberosName$NoMatchingRule: No rules applied to hdfs-dev_cluster@BDPDEV.GE.NET
I added the same rules that we had added to the advanced HDFS core-site config to the hadoop.kms.authentication.kerberos.name.rules property in the advanced kms-site config. Now I am able to run the distcp job successfully.
Created on 01-24-2022 12:56 AM - edited 01-24-2022 12:57 AM
Hi,
Do you think there is a typo in the realms below, as given in the article?
#HDP DEV Cluster
kadmin.local : addprinc krbtgt/ HDPDQA.QA.COM@ HDPDEV.DEV.COM
kadmin.local : addprinc krbtgt/ HDPDDEV.DEV.COM@ HDPQA.QA.COM
#HDP QA cluster
Kadmin.local : addprinc krbtgt/ HDPDQA.QA.COM@ HDPDEV.QA.COM
kadmin.local : addprinc krbtgt/ HDPDDEV.DEV.COM@ HDPQA.QA.COM
Created on 05-28-2022 11:27 AM
It seems the principals should be written without spaces:
#HDP DEV Cluster
kadmin.local : addprinc krbtgt/HDPDQA.QA.COM@HDPDEV.DEV.COM
kadmin.local : addprinc krbtgt/HDPDDEV.DEV.COM@HDPQA.QA.COM
#HDP QA cluster
Kadmin.local : addprinc krbtgt/HDPDQA.QA.COM@HDPDEV.QA.COM
kadmin.local : addprinc krbtgt/HDPDDEV.DEV.COM@HDPQA.QA.COM