HDFS not starting after enabling Kerberos with AD as KDC


New Contributor

I am using CDH 5.1 and have enabled Kerberos with AD as the KDC. Kerberos was enabled successfully, but the HDFS service does not start. The NameNode log includes the following for each DataNode (see below).

 

5:57:29.969 PM WARN org.apache.hadoop.security.UserGroupInformation
PriviledgedActionException as:hdfs/server999priv.pldc.co.org@TEST.MSDS.CO.ORG (auth:KERBEROS) cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Server not found in Kerberos database (7))] 

 

----

 

Secondly, the CDH 5.1 documentation says that Cloudera Manager (CM) deploys the keytab files automatically. Where can I find them?

 

 

 

Thank you for your help,

Ferds


Re: HDFS not starting after enabling Kerberos with AD as KDC

Cloudera Employee

Some standard things to check: first, that the encryption type in krb5.conf is correct. All AD versions after 2003 support "rc4-hmac", and it is present in CM by default. Also check that from the NameNode (NN) host you can resolve each DataNode (DN) host both by name and by IP address, and similarly that from the DN hosts you can resolve the NN host.
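The name and IP resolution check described above could be scripted roughly as follows. This is only a sketch: the function names are made up for illustration, the host names in the usage comment are placeholders, and the lookups go through whatever resolver the host is configured with, which may not match exactly what the Hadoop daemons see.

```python
import socket


def check_dns(hostname, forward=socket.gethostbyname, reverse=socket.gethostbyaddr):
    """Resolve hostname to an IP, resolve that IP back to a name, and compare.

    Returns (ip, reverse_name, consistent). The forward/reverse parameters
    default to the standard library resolvers and exist so the function can
    be exercised without a network.
    """
    ip = forward(hostname)                 # forward lookup (name -> address)
    reverse_name = reverse(ip)[0]          # reverse lookup (address -> primary name)
    consistent = reverse_name.lower().rstrip(".") == hostname.lower().rstrip(".")
    return ip, reverse_name, consistent


def report(hostnames):
    """Print one line per host; lookup failures are reported, not raised."""
    for host in hostnames:
        try:
            ip, reverse_name, ok = check_dns(host)
        except OSError as exc:  # socket.gaierror and socket.herror derive from OSError
            print(f"{host}: lookup failed ({exc})")
            continue
        print(f"{host} -> {ip} -> {reverse_name} [{'OK' if ok else 'MISMATCH'}]")


# Example (placeholder names; run on the NN host with the DN names, and vice versa):
# report(["namenode.example.com", "datanode1.example.com"])
```

A MISMATCH line means the reverse lookup returned a different name than the one you started with, which is worth investigating before looking further at Kerberos itself.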

 

If all that is fine, you can add this:

 

HADOOP_OPTS="-Dsun.security.krb5.debug=true"

 

to the environment safety valve for the HDFS service and restart HDFS. Then, in the stdout/stderr of the NameNode (visible in the CM web UI under the Process tab) and in /var/log/hadoop-hdfs/jsvc.out (or .err) on the DataNode hosts, you will see Kerberos error messages showing which principal is missing from the Kerberos database. That can help us debug what's happening.

Re: HDFS not starting after enabling Kerberos with AD as KDC

New Contributor

Have you figured this out yet? We're running into the same issue and have no idea where to turn. We're starting to think it's a bug, since we've verified that the keytab files generated by the installation work. (OP, in case you're interested: they are stored in /var/run/cloudera-scm-agent/process/<process>.)
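For anyone looking for the keytabs, a small sketch for listing them under the agent's process directory is below. The root path is the one mentioned above; the ".keytab" file suffix and the function name are assumptions on my part, and reading those directories will likely need elevated privileges.

```python
import os


def find_keytabs(root="/var/run/cloudera-scm-agent/process"):
    """Return a sorted list of files ending in '.keytab' below root.

    The '.keytab' suffix is an assumption about how the files are named.
    A missing or unreadable root simply yields an empty list.
    """
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(".keytab"):
                found.append(os.path.join(dirpath, name))
    return sorted(found)


# Example:
# for path in find_keytabs():
#     print(path)
```

Each file found this way can then be inspected with `klist -kt <file>` to see which principals it contains.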

 

Let me know if you want to compare notes also.

Re: HDFS not starting after enabling Kerberos with AD as KDC

Cloudera Employee

MrSmith, have you already run the Host Inspector (on the "Hosts" tab), and did it show no issues with DNS/reverse DNS lookups?

 

You should also run ldapsearch against your entire AD to check whether there are any unused UPNs/SPNs anywhere. These can prevent services from working even though kinit with the individual keytabs succeeds.

 

Run ldapsearch for "userPrincipalName=*HOST*" and "servicePrincipalName=*HOST*", replacing HOST with each host in your cluster in turn. You can also use a common string that is part of all your hostnames. If you find any duplicate UPNs or SPNs, delete them. Then delete all accounts that Cloudera Manager generated and regenerate all credentials.
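As a rough illustration of the search described above, the sketch below builds the filter string for a host (or common hostname fragment) and flags principals that appear on more than one AD object. Combining both attributes into a single OR filter is a convenience choice here, not something required; the function names and sample data are hypothetical, and the (dn, principal) pairs are assumed to have been extracted from the ldapsearch output by some other means.

```python
def principal_filter(host_fragment):
    """LDAP filter matching UPNs or SPNs that contain host_fragment."""
    return (
        f"(|(userPrincipalName=*{host_fragment}*)"
        f"(servicePrincipalName=*{host_fragment}*))"
    )


def find_duplicates(entries):
    """Given (dn, principal) pairs, return principals owned by more than one DN.

    Principals are lower-cased before comparison, on the assumption that
    they should be treated as case-insensitive.
    """
    owners = {}
    for dn, principal in entries:
        owners.setdefault(principal.lower(), set()).add(dn)
    return {p: sorted(dns) for p, dns in owners.items() if len(dns) > 1}


# Example with made-up data:
# print(principal_filter("dev.example.com"))
# print(find_duplicates([
#     ("CN=acct1,OU=Hadoop,DC=example,DC=com", "hdfs/nn1.dev.example.com"),
#     ("CN=acct2,OU=Hadoop,DC=example,DC=com", "HDFS/nn1.dev.example.com"),
# ]))
```

The filter string can be passed as the final argument to ldapsearch, after the usual connection and base DN options for your directory.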

Re: HDFS not starting after enabling Kerberos with AD as KDC

New Contributor

As far as I can tell, those are non-standard attributes for computer objects. Does the documentation specify which attributes, if any, need to be added to computer objects? Is there a process we missed in the documentation for getting UPNs/SPNs for our hosts into Active Directory?

Re: HDFS not starting after enabling Kerberos with AD as KDC

New Contributor

And yes, we ran the Host Inspector. There are no DNS or reverse lookup issues, and there are no unused UPNs or SPNs either.

Re: HDFS not starting after enabling Kerberos with AD as KDC

New Contributor

Also, after enabling debug logging, we see this in jsvc.out, which matches our original problem:

 

Found ticket for hdfs/hdpsl1.dev.dci.local@DCI.LOCAL to go to krbtgt/DCI.LOCAL@DCI.LOCAL expiring on Thu Aug 21 18:36:51 EDT 2014
Entered Krb5Context.initSecContext with state=STATE_NEW
Found ticket for hdfs/hdpsl1.dev.dci.local@DCI.LOCAL to go to krbtgt/DCI.LOCAL@DCI.LOCAL expiring on Thu Aug 21 18:36:51 EDT 2014
Service ticket not found in the subject
>>> Credentials acquireServiceCreds: same realm
default etypes for default_tgs_enctypes: 23 23 18.
>>> CksumType: sun.security.krb5.internal.crypto.RsaMd5CksumType
>>> EType: sun.security.krb5.internal.crypto.ArcFourHmacEType
>>> KrbKdcReq send: kdc=dc01.dci.local TCP:88, timeout=30000, number of retries =3, #bytes=1408
>>> KDCCommunication: kdc=dc01.dci.local TCP:88, timeout=30000,Attempt =1, #bytes=1408
>>>DEBUG: TCPClient reading 102 bytes
>>> KrbKdcReq send: #bytes read=102
>>> KdcAccessibility: remove dc01.dci.local
>>> KDCRep: init() encoding tag is 126 req type is 13
>>>KRBError:
sTime is Thu Aug 21 09:18:20 EDT 2014 1408627100000
suSec is 235534
error code is 7
error Message is Server not found in Kerberos database
realm is DCI.LOCAL
sname is hdfs/hdpmaster.dev.dci.local
msgType is 30
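When jsvc.out is long, the interesting part is the KRBError block, in particular the "sname" the KDC could not find. The sketch below pulls those fields out of debug output shaped like the excerpt above. It assumes the ">>>KRBError:" marker and the "<field> is <value>" line layout shown here; other JDK versions may format this differently, and the function name is made up.

```python
import re

# Fields of interest inside a KRBError block, as they appear in the excerpt.
_FIELD = re.compile(r"^(error code|error Message|realm|sname|msgType) is (.*)$")


def parse_krb_errors(text):
    """Return a list of dicts, one per '>>>KRBError:' block in the debug text."""
    errors = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith(">>>KRBError"):
            current = {}              # start collecting a new error block
            errors.append(current)
            continue
        if current is None:
            continue
        if line.startswith(">>>"):    # another debug record ends the block
            current = None
            continue
        match = _FIELD.match(line)
        if match:
            current[match.group(1)] = match.group(2).strip()
    return errors


# Example:
# with open("/var/log/hadoop-hdfs/jsvc.out") as fh:
#     for err in parse_krb_errors(fh.read()):
#         print(err.get("error code"), err.get("sname"), err.get("realm"))
```

For the output above, this would surface error code 7 together with the service name hdfs/hdpmaster.dev.dci.local and the realm DCI.LOCAL, which is the principal to go looking for in AD.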

Re: HDFS not starting after enabling Kerberos with AD as KDC

New Contributor

We solved this issue. It turns out Cloudera does not like a Kerberos realm that differs from the domain part of the Cloudera hosts' FQDNs (for example, hdfs/host.dev.example.com@example.com). Our systems live in child resource domains, while our users are all in the root domain.
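To spot this situation quickly, a check along the following lines compares the realm of a service principal with the domain part of its host name. This is only a diagnostic sketch with made-up function names; it assumes principals of the form service/host@REALM and does not say anything about how the mismatch should be resolved.

```python
def split_principal(principal):
    """Split 'service/host@REALM' into (service, host, realm)."""
    name, _, realm = principal.rpartition("@")
    service, _, host = name.partition("/")
    return service, host, realm


def realm_matches_host_domain(principal):
    """True when the realm equals the host's DNS domain (case-insensitive).

    The DNS domain is taken to be everything after the first dot of the host.
    """
    _service, host, realm = split_principal(principal)
    _short, _, domain = host.partition(".")
    return domain.lower() == realm.lower()


# Example with the principal from the debug output earlier in the thread:
# realm_matches_host_domain("hdfs/hdpmaster.dev.dci.local@DCI.LOCAL")
```

For a host in a child domain such as dev.dci.local with the realm DCI.LOCAL, the check returns False, which matches the setup described above.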