
Unable to Start DataNode in kerberos cluster

Master Collaborator

Hi Guys,

I'm unable to start the DataNode after enabling Kerberos in my cluster. I tried all the solutions suggested in the community and on the Internet, without any success.

All other services started, and my cluster nodes are able to authenticate against Active Directory.

Here are the important HDFS configs:

dfs.datanode.http.address = 1006
dfs.datanode.address = 1004
hadoop.security.authentication = kerberos
hadoop.security.authorization = true
hadoop.rpc.protection = authentication
Enable Kerberos Authentication for HTTP Web-Consoles = true
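For reference, in raw hdfs-site.xml/core-site.xml these settings would look roughly like the sketch below (the first five are standard Hadoop property names; the last item above is a Cloudera Manager checkbox rather than a single XML property, so it is not shown):

```xml
<!-- hdfs-site.xml (sketch; bind addresses assumed, ports as in the config above) -->
<property>
  <name>dfs.datanode.address</name>
  <value>0.0.0.0:1004</value>
</property>
<property>
  <name>dfs.datanode.http.address</name>
  <value>0.0.0.0:1006</value>
</property>

<!-- core-site.xml -->
<property>
  <name>hadoop.security.authentication</name>
  <value>kerberos</value>
</property>
<property>
  <name>hadoop.security.authorization</name>
  <value>true</value>
</property>
<property>
  <name>hadoop.rpc.protection</name>
  <value>authentication</value>
</property>
```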

and here is the log:

STARTUP_MSG:   java = 1.8.0_101
************************************************************/
2017-10-23 06:56:02,698 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: registered UNIX signal handlers for [TERM, HUP, INT]
2017-10-23 06:56:03,449 INFO org.apache.hadoop.security.UserGroupInformation: Login successful for user hdfs/aopr-dhc001.lpdomain.com@LPDOMAIN.COM using keytab file hdfs.keytab
2017-10-23 06:56:03,812 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2017-10-23 06:56:03,891 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2017-10-23 06:56:03,891 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
2017-10-23 06:56:03,899 INFO org.apache.hadoop.hdfs.server.datanode.BlockScanner: Initialized block scanner with targetBytesPerSec 1048576
2017-10-23 06:56:03,900 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: File descriptor passing is enabled.
2017-10-23 06:56:03,903 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Configured hostname is aopr-dhc001.lpdomain.com
2017-10-23 06:56:03,908 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in secureMain
java.lang.RuntimeException: Cannot start secure DataNode without configuring either privileged resources or SASL RPC data transfer protection and SSL for HTTP. Using privileged resources in combination with SASL RPC data transfer protection is not supported.
	at org.apache.hadoop.hdfs.server.datanode.DataNode.checkSecureConfig(DataNode.java:1371)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:1271)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:464)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2583)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2470)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2517)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2699)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:2723)
2017-10-23 06:56:03,919 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
2017-10-23 06:56:03,921 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at aopr-dhc001.lpdomain.com/10.16.144.131
************************************************************/
2017-10-23 06:56:08,422 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting DataNode
STARTUP_MSG:   host = aopr-dhc001.lpdomain.com/10.16.144.131
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 2.6.0-cdh5.13.0=======================
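The FATAL message names the two supported ways to run a secure DataNode: either both DataNode ports are privileged (below 1024) and the process is launched as root via jsvc, or SASL data-transfer protection plus SSL for HTTP is configured; mixing the two is rejected. A quick sanity check of the first mode, using the port values from the config above, could look like this (a sketch, not part of any Hadoop tooling):

```shell
# Mode A (privileged resources): both DataNode ports must be < 1024,
# and the DataNode must be started as root via jsvc (HADOOP_SECURE_DN_USER).
dn_port=1004    # dfs.datanode.address
http_port=1006  # dfs.datanode.http.address

if [ "$dn_port" -lt 1024 ] && [ "$http_port" -lt 1024 ]; then
  echo "ports are privileged: secure start via jsvc is possible"
else
  echo "ports are not privileged: need dfs.data.transfer.protection plus HTTPS"
fi
```

Since both ports here are already below 1024, the error usually points at the launch path (the DataNode not being started through the root/jsvc wrapper) rather than the port values themselves.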

34 Replies

Master Collaborator

Tried that, but I'm still getting the same error.

Attached below are the encryption types supported by my AD:

ad-conf-in-ad.png
ad-part-2.png

Master Mentor

@Fawze AbuJaber

Can you explain the history of your setup?

The cluster (hosts) and the Kerberos setup?

My assumption is that your target is a Linux-based cluster that uses AD as the KDC, is that right?

I just need the background to understand where things currently stand so I can help better.


Master Collaborator

@Geoffrey Shelton Okot

I have a fully Linux cluster: 5 DataNodes, 1 application node for Oozie and other applications like Hue, 2 HA nodes, 1 client server, and 1 server for Cloudera Manager.

Our Linux systems use aes256, so I added it to the krb5.conf and enabled it in Active Directory.
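For reference, enabling AES-256 on the client side typically means enctype lines like the sketch below in /etc/krb5.conf (exact values depend on your environment), together with "This account supports Kerberos AES 256 bit encryption" being ticked on the AD service accounts:

```ini
# /etc/krb5.conf (sketch) -- allow AES-256 alongside the other enctypes
[libdefaults]
  default_realm = LPDOMAIN.COM
  default_tgs_enctypes = aes256-cts-hmac-sha1-96 aes128-cts-hmac-sha1-96 rc4-hmac
  default_tkt_enctypes = aes256-cts-hmac-sha1-96 aes128-cts-hmac-sha1-96 rc4-hmac
  permitted_enctypes   = aes256-cts-hmac-sha1-96 aes128-cts-hmac-sha1-96 rc4-hmac
```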

I'm using Active Directory as the Kerberos service.

I attached the AD snapshots earlier; here is the HDFS-related config.

I'm able to authenticate against Active Directory using kinit -V.

hdfs-conf-4.png
hdfs-conf-of-the-data-nodes-ports.png
hdfs-conf.png
hdfs-conf2.png

Master Mentor

@Fawze AbuJaber

Can you comment out the following lines in krb5.conf by prefixing them with a pound sign (#), like below:

#default_tgs_enctypes = rc4-hmac
#default_tkt_enctypes = rc4-hmac
#permitted_enctypes = rc4-hmac 

Then restart the KDC server and retest. Can you also upload the krb5kdc and kadmind logs?
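The commenting step above can be sketched as a one-liner over a copy of the file (illustrative only; the sample content below is made up, and the sed syntax assumes GNU sed):

```shell
# Work on a throwaway copy of krb5.conf with made-up sample content.
conf=/tmp/krb5.conf.sample
cat > "$conf" <<'EOF'
[libdefaults]
 default_realm = LPDOMAIN.COM
 default_tgs_enctypes = rc4-hmac
 default_tkt_enctypes = rc4-hmac
 permitted_enctypes = rc4-hmac
EOF

# Prefix the three enctype lines with '#' so the client negotiates every
# enctype it supports instead of being pinned to rc4-hmac (GNU sed).
sed -i 's/^\( *\)\(default_tgs_enctypes\|default_tkt_enctypes\|permitted_enctypes\)/\1#\2/' "$conf"

grep -c '^ *#' "$conf"   # expect 3 commented lines
```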

Master Collaborator

@Geoffrey Shelton Okot

When you say restart the KDC server, do you mean restarting the Active Directory server?

How can I avoid that?

Master Mentor

@Fawze AbuJaber

Can you try without rebooting the AD?

Master Collaborator

Master Mentor

@Fawze AbuJaber

Can you please double-check whether your JDK has JCE installed, which is one of the requirements for Kerberization. JCE installation steps can be found here: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.2/bk_security/content/_distribute_and_install...

Please check the output of the following command:

# zipgrep  CryptoAllPermission  $JAVA_HOME/jre/lib/security/local_policy.jar
default_local.policy:    permission javax.crypto.CryptoAllPermission; 

It would also be good to check whether your krb5.conf contains the property 'udp_preference_limit = 1' under '[libdefaults]'.
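That check can be scripted with a small awk section parser; a sketch against a made-up sample file (on a real host, point it at /etc/krb5.conf instead):

```shell
# Made-up sample krb5.conf for illustration.
conf=/tmp/krb5.conf.check
cat > "$conf" <<'EOF'
[libdefaults]
 default_realm = LPDOMAIN.COM
 udp_preference_limit = 1
[realms]
 LPDOMAIN.COM = { kdc = ad.lpdomain.com }
EOF

# Print the setting only if it appears inside the [libdefaults] section.
awk '/^\[libdefaults\]/{s=1; next} /^\[/{s=0} s && /udp_preference_limit/' "$conf"
```

If the command prints nothing, the property is either missing or sitting under the wrong section.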

Also, as mentioned earlier, did you find any log file matching the pattern "hs_err_pid" in your DataNode host's log directory?


Master Collaborator
@Jay SenSharma

I ran the command and got the same output as shown, and 'udp_preference_limit = 1' is present under '[libdefaults]'.

Master Collaborator

Hi Guys,

Really appreciate your quick responses and readiness to help. I've been stuck on this issue for almost a week without success, even though I tried all the documentation I found on the internet.

Starting to give up... 😞