Unable to Start DataNode in kerberos cluster

Master Collaborator

Hi guys,

I'm unable to start the DataNode after enabling Kerberos in my cluster. I tried all the suggested solutions from the community and the Internet, without success.

All other servers started, and the cluster and its nodes are able to authenticate against Active Directory.

Here is the important HDFS config:

dfs.datanode.http.address = 1006
dfs.datanode.address = 1004
hadoop.security.authentication = kerberos
hadoop.security.authorization = true
hadoop.rpc.protection = authentication
Enable Kerberos Authentication for HTTP Web-Consoles = true

And here is the log:

STARTUP_MSG: java = 1.8.0_101
************************************************************/
2017-10-23 06:56:02,698 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: registered UNIX signal handlers for [TERM, HUP, INT]
2017-10-23 06:56:03,449 INFO org.apache.hadoop.security.UserGroupInformation: Login successful for user hdfs/aopr-dhc001.lpdomain.com@LPDOMAIN.COM using keytab file hdfs.keytab
2017-10-23 06:56:03,812 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2017-10-23 06:56:03,891 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2017-10-23 06:56:03,891 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
2017-10-23 06:56:03,899 INFO org.apache.hadoop.hdfs.server.datanode.BlockScanner: Initialized block scanner with targetBytesPerSec 1048576
2017-10-23 06:56:03,900 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: File descriptor passing is enabled.
2017-10-23 06:56:03,903 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Configured hostname is aopr-dhc001.lpdomain.com
2017-10-23 06:56:03,908 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in secureMain
java.lang.RuntimeException: Cannot start secure DataNode without configuring either privileged resources or SASL RPC data transfer protection and SSL for HTTP. Using privileged resources in combination with SASL RPC data transfer protection is not supported.
	at org.apache.hadoop.hdfs.server.datanode.DataNode.checkSecureConfig(DataNode.java:1371)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:1271)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:464)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2583)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2470)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2517)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2699)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:2723)
2017-10-23 06:56:03,919 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
2017-10-23 06:56:03,921 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at aopr-dhc001.lpdomain.com/10.16.144.131
************************************************************/
2017-10-23 06:56:08,422 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting DataNode
STARTUP_MSG: host = aopr-dhc001.lpdomain.com/10.16.144.131
STARTUP_MSG: args = []
STARTUP_MSG: version = 2.6.0-cdh5.13.0
=======================
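The exception text in the log spells out which combinations of settings are accepted. As a rough sketch, the rule can be modelled like this. It is a paraphrase of the message only, not the Hadoop source, and the function and parameter names are hypothetical. It assumes `sasl_protection` corresponds to `dfs.data.transfer.protection` being set and `https_only` to an HTTPS-only `dfs.http.policy`. Whether the DataNode actually obtained privileged resources at startup is taken as an input here rather than derived from the configured port numbers.

```python
def secure_datanode_config_ok(privileged_resources: bool,
                              sasl_protection: bool,
                              https_only: bool) -> bool:
    """Model of the rule stated in the RuntimeException message.

    Hypothetical helper: it mirrors the wording of the error,
    not the real DataNode.checkSecureConfig implementation.
    """
    # Setup A: privileged resources only, with no SASL protection
    # on the data transfer protocol.
    if privileged_resources and not sasl_protection:
        return True
    # Setup B: SASL data transfer protection plus SSL for HTTP,
    # without privileged resources.
    if sasl_protection and https_only and not privileged_resources:
        return True
    # Everything else, including privileged resources combined with
    # SASL protection, is the case the message calls unsupported.
    return False


if __name__ == "__main__":
    for combo in [(True, False, False), (True, True, False),
                  (False, True, True), (False, False, False)]:
        print(combo, secure_datanode_config_ok(*combo))
```

Read this way, the message points at two things worth checking: whether the DataNode really starts with privileged resources, and whether a data transfer protection value is set at the same time.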

34 REPLIES

Master Collaborator

Hi Geoffrey,

Yes, I'm using CDH, but the error I'm getting is not related to CDH.

Master Mentor

@Fawze AbuJaber

Can you change the setting below from the current "authentication" to "privacy"?

core-site.xml

hadoop.rpc.protection = privacy

hdfs-site.xml

dfs.encrypt.data.transfer = true

Does the cluster have custom Java classes and dependencies? If so, include them. Have a look at this JIRA: https://issues.apache.org/jira/browse/AMBARI-8174

You may need to configure both dfs.data.transfer.protection and hadoop.rpc.protection to specify the QOP for the RPC and data transfer protocols. In some cases the values for these two properties will be the same, and it may then be easier to let dfs.data.transfer.protection default to hadoop.rpc.protection. This also ensures that an admin gets a QOP of authentication if neither value is specified.

Then restart the DataNode after the two changes in core-site.xml and hdfs-site.xml.

Master Collaborator

I tried that, but with no success. I also noticed the following error before this one, and I don't know how it might be related:

KdcAccessibility: remove ropr-mng01.lpdomain.com
>>> KDCRep: init() encoding tag is 126 req type is 11
>>>KRBError:
	 sTime is Sat Oct 28 06:26:45 EDT 2017 1509186405000
	 suSec is 487082
	 error code is 25
	 error Message is Additional pre-authentication required
	 sname is krbtgt/LPDOMAIN.COM@LPDOMAIN.COM
	 eData provided.
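For reference, the numeric code in a `KRBError` trace maps to a named error in RFC 4120. Below is a minimal lookup sketch covering a few common codes; the helper names are hypothetical. Code 25 is the KDC asking the client to send pre-authentication data, and the client normally retries with it, so on its own this trace does not necessarily indicate a failure.

```python
import re
from typing import Optional

# A few error codes from RFC 4120 (not an exhaustive table).
KRB_ERROR_NAMES = {
    6: "KDC_ERR_C_PRINCIPAL_UNKNOWN",
    14: "KDC_ERR_ETYPE_NOSUPP",
    24: "KDC_ERR_PREAUTH_FAILED",
    25: "KDC_ERR_PREAUTH_REQUIRED",
    37: "KRB_AP_ERR_SKEW",
}


def error_code_from_trace(trace: str) -> Optional[int]:
    """Pull the number out of an 'error code is N' line, if there is one."""
    match = re.search(r"error code is (\d+)", trace)
    return int(match.group(1)) if match else None


def describe_krb_error(code: int) -> str:
    """Return the RFC 4120 name for a code, or a placeholder if not listed."""
    return KRB_ERROR_NAMES.get(code, f"unknown error code {code}")


if __name__ == "__main__":
    trace = "\t error code is 25\n\t error Message is Additional pre-authentication required"
    print(describe_krb_error(error_code_from_trace(trace)))
```

An encryption type mismatch would more typically surface as code 14 rather than 25.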

Master Collaborator

When I disable Kerberos, everything works fine.

Master Mentor

@Fawze AbuJaber

There could be a couple of issues with your Kerberos setup.

I am not familiar with the Cloudera Manager / Kerberos wizard, but I have some pointers. Can you share your krb5.ini or krb5.conf?

It seems your KDC does not support the encryption type requested. The desired encryption types are specified in the following section of the Kerberos configuration file (krb5.ini or krb5.conf):

 [libdefaults]

Enable debug by running the kinit command below, where xxx.ktab is the keytab file and {xxx.ktab_Principal} is the principal. You can get the values using klist.

kinit -J-Dsun.security.krb5.debug=true -J-Djava.security.debug=true -k -t xxx.ktab {xxx.ktab_Principal}

Please let me know.

Master Collaborator

@Geoffrey Shelton Okot

supported_enctypes = aes256-cts:normal aes128-cts:normal des3-hmac-sha1:normal arcfour-hmac:normal des-hmac-sha1:normal des-cbc-md5:normal des-cbc-crc:normal

I also have the following config:

dfs.encrypt.data.transfer.algorithm=AES/CTR/NoPadding

dfs.encrypt.data.transfer.cipher.key.bitlength=256

Kerberos Encryption Types=rc4-hmac

It seems that kinit does not work the same way here as it does in HDP:

[root@aopr-dhc001 ~]# kinit -V -J-Dsun.security.krb5.debug=true -J-Djava.security.debug=true -k -t cloudera-scm@LPDOMAIN.COM.ktab {cloudera-scm@LPDOMAIN.COM.ktab_Principal}

kinit: invalid option -- 'J'
kinit: invalid option -- '-'
kinit: invalid option -- 'D'
Bad start time value un.security.krb5.debug=true
kinit: invalid option -- 'J'
kinit: invalid option -- '-'
kinit: invalid option -- 'D'
kinit: invalid option -- 'j'
kinit: invalid option -- '.'
Bad start time value ecurity.debug=true
Usage: kinit [-V] [-l lifetime] [-s start_time] [-r renewable_life] [-f | -F] [-p | -P] -n [-a | -A] [-C] [-E] [-v] [-R] [-k [-t keytab_file]] [-c cachename] [-S service_name] [-T ticket_armor_cache] [-X <attribute>[=<value>]] [principal]
options:
	-V verbose
	-l lifetime
	-s start time
	-r renewable lifetime
	-f forwardable
	-F not forwardable
	-p proxiable
	-P not proxiable
	-n anonymous
	-a include addresses
	-A do not include addresses
	-v validate
	-R renew
	-C canonicalize
	-E client is enterprise principal name
	-k use keytab
	-t filename of keytab to use
	-c Kerberos 5 cache name
	-S service
	-T armor credential cache
	-X <attribute>[=<value>]

Master Mentor

@Fawze AbuJaber

Please do this instead. The previous {.......} was an example; sorry I didn't elaborate!

kinit -V -J-Dsun.security.krb5.debug=true -J-Djava.security.debug=true -k -t cloudera-scm@LPDOMAIN.COM.ktab cloudera-scm@LPDOMAIN.COM.ktab_Principal

And can you attach the krb5.conf (Linux) or krb5.ini (Windows)? I need to see what values you have in there.

Master Collaborator

@Geoffrey Shelton Okot

[root@aopr-dhc001 ~]# cat /etc/krb5.conf

[libdefaults]
default_realm = LPDOMAIN.COM
dns_lookup_kdc = true
dns_lookup_realm = false
ticket_lifetime = 86400
renew_lifetime = 604800
forwardable = true
default_tgs_enctypes = rc4-hmac
default_tkt_enctypes = rc4-hmac
permitted_enctypes = rc4-hmac
udp_preference_limit = 1
kdc_timeout = 5000
supported_enctypes = aes256-cts:normal aes128-cts:normal des3-hmac-sha1:normal arcfour-hmac:normal des-hmac-sha1:normal des-cbc-md5:normal des-cbc-crc:normal
[realms]
LPDOMAIN.COM = {
kdc = ropr-mng01.lpdomain.com
admin_server = ropr-mng01.lpdomain.com
}
[domain_realm]
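One quick sanity check on a file like this is whether every encryption type requested under `[libdefaults]` also appears in the `supported_enctypes` list. The following is a minimal sketch of that comparison. The function names are hypothetical, the parser handles only simple `key = value` lines, and the alias table assumes the MIT krb5 convention that `rc4-hmac` and `arcfour-hmac` name the same type. It compares only the lists in the file and says nothing about which types Active Directory actually issues.

```python
# Aliases normalised to one canonical name (MIT krb5 naming, assumed here).
ALIASES = {"rc4-hmac": "arcfour-hmac", "arcfour-hmac-md5": "arcfour-hmac"}

REQUEST_KEYS = ("default_tkt_enctypes", "default_tgs_enctypes", "permitted_enctypes")


def parse_libdefaults(text: str) -> dict:
    """Collect key = value pairs from the [libdefaults] section only."""
    settings, section = {}, None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("["):
            end = line.find("]")
            section = line[1:end].strip() if end != -1 else None
            continue
        if section == "libdefaults" and "=" in line:
            key, value = line.split("=", 1)
            settings[key.strip()] = value.strip()
    return settings


def enctype_names(value: str) -> set:
    """Split a space-separated list, dropping ':salt' suffixes and aliases."""
    names = set()
    for token in value.split():
        name = token.split(":", 1)[0]
        names.add(ALIASES.get(name, name))
    return names


def unsupported_requested(settings: dict) -> set:
    """Requested enctypes that are missing from supported_enctypes."""
    supported = enctype_names(settings.get("supported_enctypes", ""))
    requested = set()
    for key in REQUEST_KEYS:
        requested |= enctype_names(settings.get(key, ""))
    return requested - supported


SAMPLE = """\
[libdefaults]
default_tgs_enctypes = rc4-hmac
default_tkt_enctypes = rc4-hmac
permitted_enctypes = rc4-hmac
supported_enctypes = aes256-cts:normal aes128-cts:normal arcfour-hmac:normal
[realms]
"""

if __name__ == "__main__":
    print(unsupported_requested(parse_libdefaults(SAMPLE)))
```

For the values posted above, the requested `rc4-hmac` is covered by `arcfour-hmac:normal`, so the result is an empty set. The file is at least internally consistent on this point.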

Master Collaborator

@Geoffrey Shelton Okot

[root@aopr-dhc001 ~]# kinit -V -J-Dsun.security.krb5.debug=true -J-Djava.security.debug=true -k -t cloudera-scm@LPDOMAIN.COM.ktab cloudera-scm@LPDOMAIN.COM.ktab_Principal

kinit: invalid option -- 'J'
kinit: invalid option -- '-'
kinit: invalid option -- 'D'
Bad start time value un.security.krb5.debug=true
kinit: invalid option -- 'J'
kinit: invalid option -- '-'
kinit: invalid option -- 'D'
kinit: invalid option -- 'j'
kinit: invalid option -- '.'
Bad start time value ecurity.debug=true
Usage: kinit [-V] [-l lifetime] [-s start_time] [-r renewable_life] [-f | -F] [-p | -P] -n [-a | -A] [-C] [-E] [-v] [-R] [-k [-t keytab_file]] [-c cachename] [-S service_name] [-T ticket_armor_cache] [-X <attribute>[=<value>]] [principal]
options:
	-V verbose
	-l lifetime
	-s start time
	-r renewable lifetime
	-f forwardable
	-F not forwardable
	-p proxiable
	-P not proxiable
	-n anonymous
	-a include addresses
	-A do not include addresses
	-v validate
	-R renew
	-C canonicalize
	-E client is enterprise principal name
	-k use keytab
	-t filename of keytab to use
	-c Kerberos 5 cache name
	-S service
	-T armor credential cache

Master Mentor

@Fawze AbuJaber

Can you make a backup and replace your krb5.conf with the file below? Please notice the difference! Can you also make sure the supported_enctypes match your AD encryption types?

[libdefaults]
  default_realm = LPDOMAIN.COM
  dns_lookup_kdc = true
  dns_lookup_realm = false
  ticket_lifetime = 86400
  renew_lifetime = 604800
  forwardable = true
  default_tgs_enctypes = rc4-hmac
  default_tkt_enctypes = rc4-hmac
  permitted_enctypes = rc4-hmac
  udp_preference_limit = 1
  kdc_timeout = 5000
  supported_enctypes = aes256-cts:normal aes128-cts:normal des3-hmac-sha1:normal arcfour-hmac:normal des-hmac-sha1:normal des-cbc-md5:normal des-cbc-crc:normal
[domain_realm]
  lpdomain.com = LPDOMAIN.COM
  .lpdomain.com = LPDOMAIN.COM
[realms]
  LPDOMAIN.COM = {
    kdc = ropr-mng01.lpdomain.com
    admin_server = ropr-mng01.lpdomain.com
  }

BRB