
AD/MIT Cross Realm Trust Not Working With Cloudera

New Contributor

I am attempting to enable Kerberos authentication for users against an Active Directory-based realm. I am following the model of having an MIT KDC to house the Cloudera principals, and then establishing a cross-realm trust to the AD realm to allow AD users to authenticate.

 

At a purely Kerberos level this is working fine (see the example below); however, when I attempt a cluster operation that requires Kerberos authentication, I see consistent and fairly general failures. I believe I have followed the AD integration docs, with the exception of the final part about configuring name translation, and I don't think authentication gets far enough for that to play a role.

 

SITE.PRODUCT.COMPANY.ORG: cluster Kerberos realm (MIT KDC)

SITE.COMPANY.ORG: Active Directory Kerberos realm

product.company.org: DNS domain of all cluster nodes

auser: the test user account

 

If it has any relevance, the krb5.conf configuration for the AD realm (and the wider OS-level AD integration) is managed by the Linux `realm` command, which utilises the underlying `adcli` command, and has therefore been auto-populated.
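
For reference, the join state and the auto-generated client settings can be inspected with `realm list` (a minimal sketch of its output; the exact fields printed vary by realmd/sssd version):

[deployer@test-edge-01 ~]$ realm list
site.company.org
  type: kerberos
  realm-name: SITE.COMPANY.ORG
  domain-name: site.company.org
  configured: kerberos-member
  server-software: active-directory
  client-software: sssd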

 

This demonstrates that the cross realm trust is working at a Kerberos level:

 

==============================================================

 

[deployer@test-edge-01 ~]$ kinit auser@SITE.COMPANY.ORG
Password for auser@SITE.COMPANY.ORG:

 

[deployer@test-edge-01 ~]$ kvno hdfs/test-data-01.product.company.org@SITE.PRODUCT.COMPANY.ORG
hdfs/test-data-01.product.company.org@SITE.PRODUCT.COMPANY.ORG: kvno = 2

 

[deployer@test-edge-01 ~]$ klist -e
Ticket cache: FILE:/tmp/krb5cc_997
Default principal: auser@SITE.COMPANY.ORG

 

Valid starting Expires Service principal
13/03/19 09:10:57 13/03/19 19:10:57 krbtgt/SITE.COMPANY.ORG@SITE.COMPANY.ORG
renew until 20/03/19 09:10:43, Etype (skey, tkt): aes256-cts-hmac-sha1-96, aes256-cts-hmac-sha1-96
13/03/19 09:11:22 13/03/19 19:10:57 krbtgt/SITE.PRODUCT.COMPANY.ORG@SITE.COMPANY.ORG
renew until 20/03/19 09:10:43, Etype (skey, tkt): aes256-cts-hmac-sha1-96, aes256-cts-hmac-sha1-96
13/03/19 09:11:22 13/03/19 19:10:57 hdfs/test-data-01.product.company.org@SITE.PRODUCT.COMPANY.ORG
renew until 18/03/19 09:11:22, Etype (skey, tkt): aes256-cts-hmac-sha1-96, aes256-cts-hmac-sha1-96

 

==============================================================


However, this is what happens when I try to run a cluster operation that requires authentication (based on the ticket granted in the `kinit` above):


==============================================================

 

[deployer@test-edge-01 ~]$ export HADOOP_OPTS="-Dsun.security.krb5.debug=true"

 

[deployer@test-edge-01 ~]$ hdfs dfs -ls hdfs://test-master-02.product.company.org:8020/
Java config name: null
Native config name: /etc/krb5.conf
Loaded from native config
>>>KinitOptions cache name is /tmp/krb5cc_997
>>>DEBUG <CCacheInputStream> client principal is auser@SITE.COMPANY.ORG
>>>DEBUG <CCacheInputStream> server principal is krbtgt/SITE.COMPANY.ORG@SITE.COMPANY.ORG
>>>DEBUG <CCacheInputStream> key type: 18
>>>DEBUG <CCacheInputStream> auth time: Wed Mar 13 09:10:57 UTC 2019
>>>DEBUG <CCacheInputStream> start time: Wed Mar 13 09:10:57 UTC 2019
>>>DEBUG <CCacheInputStream> end time: Wed Mar 13 19:10:57 UTC 2019
>>>DEBUG <CCacheInputStream> renew_till time: Wed Mar 20 09:10:43 UTC 2019
>>> CCacheInputStream: readFlags() FORWARDABLE; RENEWABLE; INITIAL; PRE_AUTH;
>>>DEBUG <CCacheInputStream> client principal is auser@SITE.COMPANY.ORG
>>>DEBUG <CCacheInputStream> server principal is X-CACHECONF:/krb5_ccache_conf_data/pa_type/krbtgt/SITE.COMPANY.ORG@SITE.COMPANY.ORG
>>>DEBUG <CCacheInputStream> key type: 0
>>>DEBUG <CCacheInputStream> auth time: Thu Jan 01 00:00:00 UTC 1970
>>>DEBUG <CCacheInputStream> start time: null
>>>DEBUG <CCacheInputStream> end time: Thu Jan 01 00:00:00 UTC 1970
>>>DEBUG <CCacheInputStream> renew_till time: null
>>> CCacheInputStream: readFlags()
>>>DEBUG <CCacheInputStream> client principal is auser@SITE.COMPANY.ORG
>>>DEBUG <CCacheInputStream> server principal is krbtgt/SITE.PRODUCT.COMPANY.ORG@SITE.COMPANY.ORG
>>>DEBUG <CCacheInputStream> key type: 18
>>>DEBUG <CCacheInputStream> auth time: Wed Mar 13 09:10:57 UTC 2019
>>>DEBUG <CCacheInputStream> start time: Wed Mar 13 09:11:22 UTC 2019
>>>DEBUG <CCacheInputStream> end time: Wed Mar 13 19:10:57 UTC 2019
>>>DEBUG <CCacheInputStream> renew_till time: Wed Mar 20 09:10:43 UTC 2019
>>> CCacheInputStream: readFlags() FORWARDABLE; RENEWABLE; PRE_AUTH;
>>>DEBUG <CCacheInputStream> client principal is auser@SITE.COMPANY.ORG
>>>DEBUG <CCacheInputStream> server principal is hdfs/test-data-01.product.company.org@SITE.PRODUCT.COMPANY.ORG
>>>DEBUG <CCacheInputStream> key type: 18
>>>DEBUG <CCacheInputStream> auth time: Wed Mar 13 09:10:57 UTC 2019
>>>DEBUG <CCacheInputStream> start time: Wed Mar 13 09:11:22 UTC 2019
>>>DEBUG <CCacheInputStream> end time: Wed Mar 13 19:10:57 UTC 2019
>>>DEBUG <CCacheInputStream> renew_till time: Mon Mar 18 09:11:22 UTC 2019
>>> CCacheInputStream: readFlags() FORWARDABLE; RENEWABLE; PRE_AUTH;
Found ticket for auser@SITE.COMPANY.ORG to go to krbtgt/SITE.COMPANY.ORG@SITE.COMPANY.ORG expiring on Wed Mar 13 19:10:57 UTC 2019
Entered Krb5Context.initSecContext with state=STATE_NEW
Found ticket for auser@SITE.COMPANY.ORG to go to krbtgt/SITE.COMPANY.ORG@SITE.COMPANY.ORG expiring on Wed Mar 13 19:10:57 UTC 2019
Service ticket not found in the subject
>>> Realm doInitialParse: cRealm=[SITE.COMPANY.ORG], sRealm=[SITE.PRODUCT.COMPANY.ORG]
>>> Realm parseCapaths: no cfg entry
>>> Credentials acquireServiceCreds: main loop: [0] tempService=krbtgt/SITE.PRODUCT.COMPANY.ORG@SITE.COMPANY.ORG
Using builtin default etypes for default_tgs_enctypes
default etypes for default_tgs_enctypes: 18 17 16 23 1 3.
>>> CksumType: sun.security.krb5.internal.crypto.RsaMd5CksumType
>>> EType: sun.security.krb5.internal.crypto.Aes256CtsHmacSha1EType
>>> KdcAccessibility: reset
>>> Credentials acquireServiceCreds: no tgt; searching backwards
>>> Credentials acquireServiceCreds: inner loop: [1] tempService=krbtgt/COMPANY.ORG@SITE.COMPANY.ORG
Using builtin default etypes for default_tgs_enctypes
default etypes for default_tgs_enctypes: 18 17 16 23 1 3.
>>> CksumType: sun.security.krb5.internal.crypto.RsaMd5CksumType
>>> EType: sun.security.krb5.internal.crypto.Aes256CtsHmacSha1EType
>>> Credentials acquireServiceCreds: inner loop: [2] tempService=krbtgt/PRODUCT.COMPANY.ORG@SITE.COMPANY.ORG
Using builtin default etypes for default_tgs_enctypes
default etypes for default_tgs_enctypes: 18 17 16 23 1 3.
>>> CksumType: sun.security.krb5.internal.crypto.RsaMd5CksumType
>>> EType: sun.security.krb5.internal.crypto.Aes256CtsHmacSha1EType
>>> Credentials acquireServiceCreds: no tgt; cannot get creds
KrbException: Fail to create credential. (63) - No service creds
  at sun.security.krb5.internal.CredentialsUtil.acquireServiceCreds(CredentialsUtil.java:299)
  at sun.security.krb5.Credentials.acquireServiceCreds(Credentials.java:454)
  at sun.security.jgss.krb5.Krb5Context.initSecContext(Krb5Context.java:641)
  at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:248)
  at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
  at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:193)
  at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:413)
  at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:560)
  at org.apache.hadoop.ipc.Client$Connection.access$1900(Client.java:375)
  at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:730)
  at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:726)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:415)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
  at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:725)
  at org.apache.hadoop.ipc.Client$Connection.access$2900(Client.java:375)
  at org.apache.hadoop.ipc.Client.getConnection(Client.java:1524)
  at org.apache.hadoop.ipc.Client.call(Client.java:1447)
  at org.apache.hadoop.ipc.Client.call(Client.java:1408)
  at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
  at com.sun.proxy.$Proxy14.getFileInfo(Unknown Source)
  at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:762)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:606)
  at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)
  at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
  at com.sun.proxy.$Proxy15.getFileInfo(Unknown Source)
  at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2121)
  at org.apache.hadoop.hdfs.DistributedFileSystem$19.doCall(DistributedFileSystem.java:1215)
  at org.apache.hadoop.hdfs.DistributedFileSystem$19.doCall(DistributedFileSystem.java:1211)
  at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
  at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1211)
  at org.apache.hadoop.fs.Globber.getFileStatus(Globber.java:64)
  at org.apache.hadoop.fs.Globber.doGlob(Globber.java:285)
  at org.apache.hadoop.fs.Globber.glob(Globber.java:151)
  at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:1639)
  at org.apache.hadoop.fs.shell.PathData.expandAsGlob(PathData.java:326)
  at org.apache.hadoop.fs.shell.Command.expandArgument(Command.java:235)
  at org.apache.hadoop.fs.shell.Command.expandArguments(Command.java:218)
  at org.apache.hadoop.fs.shell.FsCommand.processRawArguments(FsCommand.java:102)
  at org.apache.hadoop.fs.shell.Command.run(Command.java:165)
  at org.apache.hadoop.fs.FsShell.run(FsShell.java:315)
  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
  at org.apache.hadoop.fs.FsShell.main(FsShell.java:372)

19/03/13 09:13:19 WARN security.UserGroupInformation: PriviledgedActionException as:auser@SITE.COMPANY.ORG (auth:KERBEROS) cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Fail to create credential. (63) - No service creds)]


==============================================================


The output above is actually truncated, but as far as I can see the remainder is a repetition of the same loop.
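
For reference, the `Realm parseCapaths: no cfg entry` line in the debug output means the Java Kerberos stack found no [capaths] routing for this realm pair and fell back to guessing a hierarchical path (hence the krbtgt/COMPANY.ORG and krbtgt/PRODUCT.COMPANY.ORG attempts). A minimal sketch of an explicit direct-trust entry would be (whether a given JDK honours this exact form is version-dependent):

[capaths]

  SITE.COMPANY.ORG = {
    SITE.PRODUCT.COMPANY.ORG = .
  }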

 

Things that I have considered:

 

  • Ensured that 'SITE.PRODUCT.COMPANY.ORG' and 'SITE.COMPANY.ORG' are added to the 'Trusted Kerberos Realms' HDFS service-wide configuration, and that 'dfs.namenode.kerberos.principal.pattern' with a value of '*' is added to the HDFS Client Advanced Configuration Snippet option in the HDFS Gateway scope of its configuration. The cluster was restarted, with no change in symptoms.
  • Encryption types. The corresponding krbtgt principals in both realms have the same enctypes enabled (namely aes256-cts-hmac-sha1-96, aes128-cts-hmac-sha1-96, arcfour-hmac). As inherited, the cluster only had 'rc4-hmac' enabled in 'Kerberos Encryption Types' within the Kerberos settings of Cloudera Manager, but I have since added 'aes256-cts-hmac-sha1-96' and restarted the cluster, with no change.
  • Java export policy. We are using JDK 1.7. I downloaded the JCE unlimited-strength policy files, compared them to those in our installation, and confirmed they are the same, in case AES256 was not enabled within Java. (Two quick checks covering these last two bullets are sketched after this list.)
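
The checks referenced above, as hedged sketches (the kadmin query must run on the MIT KDC, and the jrunscript one-liner relies on the scripting shell bundled with the stock JDK 7):

# On the MIT KDC: list the enctypes held by the cross-realm krbtgt principal
kadmin.local -q "getprinc krbtgt/SITE.PRODUCT.COMPANY.ORG@SITE.COMPANY.ORG"

# On a cluster node: 2147483647 means the unlimited JCE policy is active;
# 128 means this JDK cannot decrypt AES256 Kerberos tickets
jrunscript -e 'print(javax.crypto.Cipher.getMaxAllowedKeyLength("AES"))'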

I am running out of ideas. Can anybody suggest what else I should be checking/may be missing?

 

Many thanks.

1 ACCEPTED SOLUTION

New Contributor

Looks like I have got to the bottom of this: it is rooted in the way that realm/adcli/sssd manages the configuration of servers that are members of an AD domain.

 

We join machines to the domain using `realm join ...`. This convenient command takes care of creating a machine account in the domain and then managing all the config files that need amending on the host being joined (sssd.conf, krb5.conf, PAM, etc).
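
For context, the join itself is a one-liner (a sketch; the administrative account name is hypothetical):

[root@test-edge-01 ~]# realm join --user=ad-admin site.company.org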

 

Specifically, for the krb5.conf it adds an empty block for the realm:

 

[realms]

  SITE.COMPANY.ORG = {
  }

and an include statement:

 

includedir /var/lib/sss/pubconf/krb5.include.d/

Within this included directory is a series of config fragments that take care of the actual configuration: a nice, manageable way of handling complex setups, such as when you have multiple realms.

 

Java 1.7, however, does not support 'includedir' directives in krb5.conf.

 

Therefore native Kerberos operations work fine with this config, but the Java-based Cloudera tooling is unable to use it.

 

The workaround is to explicitly add 'kdc' and 'admin_server' directives within the previously empty realm block.
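
Concretely, the amended block ends up looking like this (a sketch; the domain controller hostname is hypothetical, so substitute your actual AD DCs):

[realms]

  SITE.COMPANY.ORG = {
    kdc = dc-01.site.company.org
    admin_server = dc-01.site.company.org
  }

With this in place, the Java Kerberos stack can locate the AD realm's KDC without needing the sssd include fragments.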


2 REPLIES


Explorer

If you use an MIT Kerberos server or FreeIPA, hardcoding 'kdc' entries is a poor workaround, because you should provide HA for Kerberos by using DNS to balance across multiple KDC servers.

 

Instead, switch on dns_lookup_kdc = true and the client will discover the KDCs of any external realm via DNS. If that realm has a trust (for example, a two-way trust), you can connect directly to an external KDC to get a TGT and then request a service ticket (TGS) from your own realm, or get a TGT in your own realm and connect to an external server with a service ticket for that service.
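
A sketch of that approach (the SRV record names are the standard Kerberos ones; the hosts returned are obviously site-specific):

[libdefaults]

  dns_lookup_kdc = true

# Confirm the realm publishes KDC locations in DNS:
dig +short SRV _kerberos._udp.SITE.COMPANY.ORG
dig +short SRV _kerberos._tcp.SITE.COMPANY.ORG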

 

Java (not Hadoop) does not support included configs, but when authentication runs through the native OS stack, the processing goes via sssd, which does use the included config to obtain the KDC info.

However, if your Active Directory domain is at the second domain level while the MIT realm is at the third level, you will hit a routing conflict, because all of your internal realm requests will go to AD.

This can be solved by adding routing to the [domain_realm] section of krb5.conf, like:

 

[domain_realm]

mit.domain.local = MIT.DOMAIN.LOCAL

.mit.domain.local = MIT.DOMAIN.LOCAL

host.mit.domain.local = MIT.DOMAIN.LOCAL

domain.local = AD.DOMAIN.LOCAL

.domain.local = AD.DOMAIN.LOCAL
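
Once these mappings are in place, MIT client tracing can confirm which realm and KDC a given request routes to (a sketch; KRB5_TRACE is an MIT krb5 facility, and the service principal here is illustrative):

KRB5_TRACE=/dev/stderr kvno host/host.mit.domain.local@MIT.DOMAIN.LOCAL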