Created on 10-27-2017 04:36 PM - edited 09-16-2022 05:27 AM
Hi Guys,
I'm unable to start the DataNode after enabling Kerberos in my cluster. I have tried all the solutions suggested in the community and on the Internet, without any success.
All other services started, and my cluster nodes are able to authenticate against Active Directory.
Here are the important HDFS configs:
dfs.datanode.http.address 1006
dfs.datanode.address 1004
hadoop.security.authentication kerberos
hadoop.security.authorization true
hadoop.rpc.protection authentication
Enable Kerberos Authentication for HTTP Web-Consoles true
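In hdfs-site.xml form, the two port settings above would look roughly like this (a sketch only; it assumes the default 0.0.0.0 bind address, and ports below 1024 are the privileged ports a secure DataNode needs when not using SASL):

<property>
  <name>dfs.datanode.address</name>
  <value>0.0.0.0:1004</value>
</property>
<property>
  <name>dfs.datanode.http.address</name>
  <value>0.0.0.0:1006</value>
</property>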
and here is the log:

STARTUP_MSG:   java = 1.8.0_101
************************************************************/
2017-10-23 06:56:02,698 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: registered UNIX signal handlers for [TERM, HUP, INT]
2017-10-23 06:56:03,449 INFO org.apache.hadoop.security.UserGroupInformation: Login successful for user hdfs/aopr-dhc001.lpdomain.com@LPDOMAIN.COM using keytab file hdfs.keytab
2017-10-23 06:56:03,812 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2017-10-23 06:56:03,891 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2017-10-23 06:56:03,891 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
2017-10-23 06:56:03,899 INFO org.apache.hadoop.hdfs.server.datanode.BlockScanner: Initialized block scanner with targetBytesPerSec 1048576
2017-10-23 06:56:03,900 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: File descriptor passing is enabled.
2017-10-23 06:56:03,903 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Configured hostname is aopr-dhc001.lpdomain.com
2017-10-23 06:56:03,908 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in secureMain
java.lang.RuntimeException: Cannot start secure DataNode without configuring either privileged resources or SASL RPC data transfer protection and SSL for HTTP. Using privileged resources in combination with SASL RPC data transfer protection is not supported.
        at org.apache.hadoop.hdfs.server.datanode.DataNode.checkSecureConfig(DataNode.java:1371)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:1271)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:464)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2583)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2470)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2517)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2699)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:2723)
2017-10-23 06:56:03,919 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
2017-10-23 06:56:03,921 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at aopr-dhc001.lpdomain.com/10.16.144.131
************************************************************/
2017-10-23 06:56:08,422 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting DataNode
STARTUP_MSG:   host = aopr-dhc001.lpdomain.com/10.16.144.131
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 2.6.0-cdh5.13.0
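The FATAL line spells out the two supported modes for a secure DataNode: privileged ports (as configured above), or SASL data transfer protection combined with HTTPS, but not both at once. For comparison, a minimal sketch of the SASL-based alternative in hdfs-site.xml (this assumes non-privileged ports and a working TLS/keystore setup; it is not taken from this cluster):

<property>
  <name>dfs.data.transfer.protection</name>
  <value>authentication</value>
</property>
<property>
  <name>dfs.http.policy</name>
  <value>HTTPS_ONLY</value>
</property>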
Created 10-28-2017 08:48 AM
Hi Geoffrey,
Yes, I'm using CDH, but the error I'm getting is not related to CDH.
Created 10-28-2017 10:03 AM
Can you change the below from the current "authentication" to "privacy"?
core-site.xml
hadoop.rpc.protection = privacy
hdfs-site.xml
dfs.encrypt.data.transfer=true
Does the cluster have custom Java classes and dependencies? If so, include them. Have a look at this JIRA: https://issues.apache.org/jira/browse/AMBARI-8174
You may need to configure both dfs.data.transfer.protection and hadoop.rpc.protection to specify the QOP for the RPC and data transfer protocols. In some cases the values for these two properties will be the same; in those cases it may be easier to let dfs.data.transfer.protection default to hadoop.rpc.protection. This also ensures that an admin gets a QOP of "authentication" if neither value is specified.
Then restart the DataNode after making the two changes in core-site.xml / hdfs-site.xml; a minimal sketch of both entries follows below.
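A sketch of the two entries (the property names and values come from the suggestion above; the XML wrapping is just the standard Hadoop form):

<!-- core-site.xml -->
<property>
  <name>hadoop.rpc.protection</name>
  <value>privacy</value>
</property>

<!-- hdfs-site.xml -->
<property>
  <name>dfs.encrypt.data.transfer</name>
  <value>true</value>
</property>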
Created 10-28-2017 10:31 AM
Tried, but with no success. In fact, I noticed the error below before this one and don't know how it might be related:

KdcAccessibility: remove ropr-mng01.lpdomain.com
>>> KDCRep: init() encoding tag is 126 req type is 11
>>> KRBError:
        sTime is Sat Oct 28 06:26:45 EDT 2017 1509186405000
        suSec is 487082
        error code is 25
        error Message is Additional pre-authentication required
        sname is krbtgt/LPDOMAIN.COM@LPDOMAIN.COM
        eData provided.
Created 10-28-2017 10:33 AM
When I disable Kerberos, everything works fine.
Created 10-28-2017 11:44 AM
There could be a couple of issues with your Kerberos setup.
I am not familiar with the Cloudera Manager Kerberos wizard, but I have some pointers. Can you share your krb5.ini or krb5.conf?
It seems your KDC does not support the requested encryption type. The desired encryption types are specified under the following tags in the Kerberos configuration file (krb5.ini or krb5.conf):
[libdefaults]
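(The specific tags appear to have been omitted here; based on the krb5.conf shown later in the thread, the relevant ones are the following, with illustrative values that must match what the KDC actually supports:)

default_tgs_enctypes = aes256-cts rc4-hmac
default_tkt_enctypes = aes256-cts rc4-hmac
permitted_enctypes = aes256-cts rc4-hmac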
Enable debugging by running the kinit below, where xxx.ktab is the keytab and xxx.ktab_Principal is the principal; you can get the values using klist:
kinit -J-Dsun.security.krb5.debug=true -J-Djava.security.debug=true -k -t xxx.ktab {xxx.ktab_Principal}
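(Note: the -J-D... flags are understood only by the JDK's kinit under $JAVA_HOME/bin; the MIT Kerberos kinit shipped with Linux rejects them. With MIT kinit, a comparable trace can be captured via the KRB5_TRACE environment variable; keytab path and principal below are placeholders:)

KRB5_TRACE=/dev/stdout kinit -V -k -t /path/to/xxx.ktab xxx.ktab_Principal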
Please let me know
Created 10-28-2017 01:12 PM
supported_enctypes = aes256-cts:normal aes128-cts:normal des3-hmac-sha1:normal arcfour-hmac:normal des-hmac-sha1:normal des-cbc-md5:normal des-cbc-crc:normal
I have the following config as well:
dfs.encrypt.data.transfer.algorithm=AES/CTR/NoPadding
dfs.encrypt.data.transfer.cipher.key.bitlength=256
Kerberos Encryption Types=rc4-hmac
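(As a quick sanity check of which enctypes the KDC actually issues, one can authenticate from the keytab and then inspect the cached ticket; the keytab path and principal here are placeholders:)

kinit -k -t /path/to/xxx.ktab xxx.ktab_Principal
klist -e          # -e prints the encryption type of each cached ticket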
It seems that kinit does not work the same way here as it does in HDP:
[root@aopr-dhc001 ~]# kinit -V -J-Dsun.security.krb5.debug=true -J-Djava.security.debug=true -k -t cloudera-scm@LPDOMAIN.COM.ktab {cloudera-scm@LPDOMAIN.COM.ktab_Principal}
kinit: invalid option -- 'J'
kinit: invalid option -- '-'
kinit: invalid option -- 'D'
Bad start time value un.security.krb5.debug=true
kinit: invalid option -- 'J'
kinit: invalid option -- '-'
kinit: invalid option -- 'D'
kinit: invalid option -- 'j'
kinit: invalid option -- '.'
Bad start time value ecurity.debug=true
Usage: kinit [-V] [-l lifetime] [-s start_time] [-r renewable_life] [-f | -F] [-p | -P] -n [-a | -A] [-C] [-E] [-v] [-R] [-k [-t keytab_file]] [-c cachename] [-S service_name] [-T ticket_armor_cache] [-X <attribute>[=<value>]] [principal]
options:
        -V verbose
        -l lifetime
        -s start time
        -r renewable lifetime
        -f forwardable
        -F not forwardable
        -p proxiable
        -P not proxiable
        -n anonymous
        -a include addresses
        -A do not include addresses
        -v validate
        -R renew
        -C canonicalize
        -E client is enterprise principal name
        -k use keytab
        -t filename of keytab to use
        -c Kerberos 5 cache name
        -S service
        -T armor credential cache
        -X <attribute>[=<value>]
Created 10-28-2017 01:28 PM
Please do this instead; the previous {.......} was an example, sorry I didn't elaborate!
kinit -V -J-Dsun.security.krb5.debug=true -J-Djava.security.debug=true -k -t cloudera-scm@LPDOMAIN.COM.ktab cloudera-scm@LPDOMAIN.COM.ktab_Principal
And can you attach the krb5.conf (Linux) or krb5.ini (Windows)? I need to see what values you have in there.
Created 10-28-2017 01:32 PM
[root@aopr-dhc001 ~]# cat /etc/krb5.conf
[libdefaults]
default_realm = LPDOMAIN.COM
dns_lookup_kdc = true
dns_lookup_realm = false
ticket_lifetime = 86400
renew_lifetime = 604800
forwardable = true
default_tgs_enctypes = rc4-hmac
default_tkt_enctypes = rc4-hmac
permitted_enctypes = rc4-hmac
udp_preference_limit = 1
kdc_timeout = 5000
supported_enctypes = aes256-cts:normal aes128-cts:normal des3-hmac-sha1:normal arcfour-hmac:normal des-hmac-sha1:normal des-cbc-md5:normal des-cbc-crc:normal
[realms]
LPDOMAIN.COM = {
  kdc = ropr-mng01.lpdomain.com
  admin_server = ropr-mng01.lpdomain.com
}
[domain_realm]
Created 10-31-2017 10:19 PM
[root@aopr-dhc001 ~]# kinit -V -J-Dsun.security.krb5.debug=true -J-Djava.security.debug=true -k -t cloudera-scm@LPDOMAIN.COM.ktab cloudera-scm@LPDOMAIN.COM.ktab_Principal
kinit: invalid option -- 'J'
kinit: invalid option -- '-'
kinit: invalid option -- 'D'
Bad start time value un.security.krb5.debug=true
kinit: invalid option -- 'J'
kinit: invalid option -- '-'
kinit: invalid option -- 'D'
kinit: invalid option -- 'j'
kinit: invalid option -- '.'
Bad start time value ecurity.debug=true
Usage: kinit [-V] [-l lifetime] [-s start_time] [-r renewable_life] [-f | -F] [-p | -P] -n [-a | -A] [-C] [-E] [-v] [-R] [-k [-t keytab_file]] [-c cachename] [-S service_name] [-T ticket_armor_cache] [-X <attribute>[=<value>]] [principal]
options:
        -V verbose
        -l lifetime
        -s start time
        -r renewable lifetime
        -f forwardable
        -F not forwardable
        -p proxiable
        -P not proxiable
        -n anonymous
        -a include addresses
        -A do not include addresses
        -v validate
        -R renew
        -C canonicalize
        -E client is enterprise principal name
        -k use keytab
        -t filename of keytab to use
        -c Kerberos 5 cache name
        -S service
        -T armor credential cache
Created 10-28-2017 01:52 PM
Can you make a backup and replace your krb5.conf with the file below? Please notice the difference! Can you also make sure the supported_enctypes match your AD encryption?

[libdefaults]
default_realm = LPDOMAIN.COM
dns_lookup_kdc = true
dns_lookup_realm = false
ticket_lifetime = 86400
renew_lifetime = 604800
forwardable = true
default_tgs_enctypes = rc4-hmac
default_tkt_enctypes = rc4-hmac
permitted_enctypes = rc4-hmac
udp_preference_limit = 1
kdc_timeout = 5000
supported_enctypes = aes256-cts:normal aes128-cts:normal des3-hmac-sha1:normal arcfour-hmac:normal des-hmac-sha1:normal des-cbc-md5:normal des-cbc-crc:normal

[realms]
LPDOMAIN.COM = {
  kdc = ropr-mng01.lpdomain.com
  admin_server = ropr-mng01.lpdomain.com
}

[domain_realm]
lpdomain.com = LPDOMAIN.COM
.lpdomain.com = LPDOMAIN.COM
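(To check the keytab side of the enctype match, klist can list what is stored in the keytab itself; the path below is a placeholder:)

klist -k -t -e /path/to/cloudera-scm.keytab   # keytab entries with timestamps and enctypes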
BRB