
Data node failing after enabling kerberos

After enabling Kerberos, the DataNode started failing to connect to the NameNode.

 

Error in datanode log:

WARN org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:hdfs/hdp-3.com@CDH.HDP (auth:KERBEROS) cause:java.io.IOException: Couldn't setup connection for hdfs/hdp-3.com@CDH.HDP to hdp-1.com/192.1.1.1:8022

 

WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Problem connecting to server: hdp-1.com/192.1.1.1:8022

 

WARN org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:hdfs/hdp-3..com@CDH.HDP (auth:KERBEROS) cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Ticket expired (32) - PROCESS_TGS)]

 

WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool ID needed, but service not yet registered with NN, trace:

java.lang.Exception

 

 

krb5.conf

cat /etc/krb5.conf

[libdefaults]
default_realm = CDH.HDP
dns_lookup_kdc = false
dns_lookup_realm = false
ticket_lifetime = 86400
renew_lifetime = 604800
forwardable = true
default_tgs_enctypes = des-cbc-crc aes des-cbc-md5 arcfour-hmac rc4
default_tkt_enctypes = des-cbc-crc aes des-cbc-md5 arcfour-hmac rc4
permitted_enctypes = des-cbc-crc aes des-cbc-md5 arcfour-hmac rc4
udp_preference_limit = 1
kdc_timeout = 10000

[realms]
CDH.HDP = {
kdc = hdp-2.com
admin_server = hdp-2.com
default_domain = cdh.hdp
}

[domain_realm]
cdh.hdp = CDH.HDP

 

 

kdc.conf

[kdcdefaults]
kdc_ports = 88
kdc_tcp_ports = 88

[realms]
CDH.HDP = {
  #master_key_type = aes256-cts
  acl_file = /var/kerberos/krb5kdc/kadm5.acl
  dict_file = /usr/share/dict/words
  admin_keytab = /var/kerberos/krb5kdc/kadm5.keytab
  supported_enctypes = aes128-cts:normal des3-hmac-sha1:normal arcfour-hmac:normal camellia256-cts:normal camellia128-cts:normal des-hmac-sha1:normal des-cbc-md5:normal des-cbc-crc:normal
}

 

Please help to resolve this.


Re: Data node failing after enabling kerberos

@sid2707,

 

Since the behavior you describe matches a known issue with a particular version of Kerberos, I think that is a good place to look first:

 

https://bugzilla.redhat.com/show_bug.cgi?id=1560951

 

Check your krb5 packages and make sure that you do not have:

1.15.1-18.el7

 

If you do, that version is known to cause problems for Java Kerberos.

 

Upgrading the MIT Kerberos packages to 1.15.1-19 has been known to resolve the issue.
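
For anyone who wants to script the package check described above, here is a minimal sketch. The function name find_affected_krb5 is made up for illustration, and it assumes the package list is in the usual name-version-release.arch form that `rpm -qa` prints. It only looks for the exact build named in this thread.

```shell
#!/usr/bin/env bash
# Flag krb5 packages at the build cited in this thread (1.15.1-18.el7).
# Reads package names, one per line, from stdin.
# Prints each affected package; returns 0 if none are affected, 1 otherwise.
# (find_affected_krb5 is a hypothetical helper name, not part of any package.)
find_affected_krb5() {
    local affected_build='1.15.1-18.el7'
    local found=0 pkg
    # The "|| [ -n ... ]" part handles a last line without a trailing newline.
    while IFS= read -r pkg || [ -n "$pkg" ]; do
        case "$pkg" in
            # Assumes an architecture suffix such as ".x86_64" follows the release.
            krb5-*-"$affected_build".*)
                echo "affected: $pkg"
                found=1
                ;;
        esac
    done
    return "$found"
}
```

On a host where the DataNode is failing, it could be run as `rpm -qa | find_affected_krb5`. No output and a zero exit status would suggest this particular build is not installed, which does not rule out other causes.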

 


Re: Data node failing after enabling kerberos


@bgooley Do you know if this issue exists with:

 

krb5-workstation-1.10.3-65.el6.x86_64
krb5-auth-dialog-0.13-6.el6.x86_64
krb5-libs-1.10.3-65.el6.x86_64

 

I experienced the same issue with these packages, but with the following error:

 

2017-10-23 06:56:03,908 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in secureMain
java.lang.RuntimeException: Cannot start secure DataNode without configuring either privileged resources or SASL RPC data transfer protection and SSL for HTTP.  Using privileged resources in combination with SASL RPC data transfer protection is not supported.
at org.apache.hadoop.hdfs.server.datanode.DataNode.checkSecureConfig(DataNode.java:1371)
at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:1271)
at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:464)
at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2583)
at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2470)
at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2517)
at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2699)
at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:2723)
2017-10-23 06:56:03,919 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
2017-10-23 06:56:03,921 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
 
 
I can authenticate against AD, and I can confirm that the ports used for HDFS are below 1023.

Re: Data node failing after enabling kerberos

kinit works for me too, and ZooKeeper and the NameNode start, but the DataNode fails to connect to the NameNode and then the whole cluster comes down.

Re: Data node failing after enabling kerberos

@sid2707,

 

What ports did you change? I think you need both of these to be less than 1024 if you don't have HTTPS configured:

 

DataNode Transceiver Port

DataNode HTTP Web UI Port
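
A quick way to test a configured address against that "less than 1024" rule is sketched below. The helper name is_privileged_port is hypothetical, and it simply parses the port out of a host:port string.

```shell
#!/usr/bin/env bash
# Succeeds (exit 0) when the port in a "host:port" address is a privileged
# port, i.e. between 1 and 1023. Returns 2 if the port is not a number.
# (is_privileged_port is a hypothetical helper name used for illustration.)
is_privileged_port() {
    local address=$1
    local port=${address##*:}      # keep the text after the last colon
    case "$port" in
        ''|*[!0-9]*)
            echo "not a port number: '$port'" >&2
            return 2
            ;;
    esac
    [ "$port" -gt 0 ] && [ "$port" -lt 1024 ]
}
```

For example, `is_privileged_port "$(hdfs getconf -confKey dfs.datanode.address)"` and the same call with dfs.datanode.http.address. This assumes those two properties are the ones behind the Cloudera Manager settings listed above, and that the configuration `hdfs getconf` reads on that host matches what the DataNode role actually runs with, which may not hold on a managed cluster.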


Re: Data node failing after enabling kerberos

@Fawze and @sid2707,

 

Sorry... I think we need to separate this conversation, since the issues differ. I was responding to what @Fawze said about the DataNode not starting with the SASL message.

 

@sid2707,

 

I mentioned a possible cause relating to your krb5 libraries.  Please run the following on one of the hosts where datanodes are not functioning:

 

# rpm -qa |grep krb5


Re: Data node failing after enabling kerberos

Thanks @bgooley

I solved this by upgrading the OS and the Kerberos version. It works fine for me now.

Thanks for your help