
DataNode service failed with Exception in secureMain

Contributor

Hello!

 

I'm trying to add a new DataNode in a running cluster.

The cluster has HA for HDFS (NameNodes) and for YARN (ResourceManagers), and is secured with Kerberos.

When I performed the necessary steps to add a new DN and started the hadoop-hdfs-datanode service, the new node didn't show up in the list of DNs (I performed a refresh on the NNs).

In /var/log/hadoop-hdfs/hadoop-hdfs-datanode-mynode.log there is nothing logged.

The output of the command "hdfs datanode" is:

 

2017-06-26 15:42:58,544 INFO security.UserGroupInformation: Login successful for user hdfs/my.datanode.fqdn@MY.REALM.FQDN using keytab file /path/to/hdfs.keytab
2017-06-26 15:42:59,292 INFO impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2017-06-26 15:42:59,333 INFO impl.MetricsSinkAdapter: Sink collectd started
2017-06-26 15:42:59,385 INFO impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2017-06-26 15:42:59,385 INFO impl.MetricsSystemImpl: DataNode metrics system started
2017-06-26 15:42:59,404 INFO datanode.DataNode: File descriptor passing is enabled.
2017-06-26 15:42:59,407 INFO datanode.DataNode: Configured hostname is my.datanode.fqdn

2017-06-26 15:42:59,415 FATAL datanode.DataNode: Exception in secureMain
java.lang.RuntimeException: Cannot start secure DataNode without configuring either privileged resources or SASL RPC data transfer protection and SSL for HTTP. Using privileged resources in combination with SASL RPC data transfer protection is not supported.
at org.apache.hadoop.hdfs.server.datanode.DataNode.checkSecureConfig(DataNode.java:1205)
at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:1106)
at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:451)
at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2406)
at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2293)
at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2340)
at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2517)
at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:2541)
2017-06-26 15:42:59,436 INFO util.ExitUtil: Exiting with status 1
2017-06-26 15:42:59,475 INFO datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at my.datanode.fqdn/my-datanode-ip-address
************************************************************/
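
For readers hitting the same FATAL: the check in DataNode.checkSecureConfig accepts a Kerberized DataNode in one of two setups, and the error fires when neither (or both at once) is configured. In the property = value (file) shorthand used later in this thread, the two supported setups look roughly like this; the port numbers are the conventional defaults, not values from this cluster:

Option A, privileged ports (DataNode started as root via jsvc):
dfs.datanode.address = 0.0.0.0:1004 (hdfs-site.xml, port below 1024)
dfs.datanode.http.address = 0.0.0.0:1006 (hdfs-site.xml, port below 1024)
plus HADOOP_SECURE_DN_USER=hdfs and JSVC_HOME set for the launcher scripts

Option B, SASL data transfer plus HTTPS (Hadoop 2.6+, no root needed):
dfs.data.transfer.protection = authentication, integrity or privacy (hdfs-site.xml)
dfs.http.policy = HTTPS_ONLY (hdfs-site.xml)
with dfs.datanode.address left on a non-privileged port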

 

Thanks for your help!

 

Guido.


9 REPLIES

Champion

Could you check the value of this parameter in your running DataNode's hdfs-site.xml:

dfs.data.transfer.protection = ?
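
(For anyone checking the same thing: a quick way to see the value the node actually resolves, assuming the DataNode reads the client configuration in /etc/hadoop/conf, is:

hdfs getconf -confKey dfs.data.transfer.protection

If the key is unset, the command reports it as missing instead of printing a value.)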

 

 

Contributor

Hello @csguna!

 

The property "dfs.data.transfer.protection" is not present in the hdfs-site.xml file.
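
(Worth noting, assuming stock Hadoop behavior: with dfs.data.transfer.protection unset, checkSecureConfig falls back to requiring privileged resources, i.e. dfs.datanode.address and dfs.datanode.http.address on ports below 1024 and a DataNode launched through the secure jsvc path. A quick check of those two on the node:

hdfs getconf -confKey dfs.datanode.address
hdfs getconf -confKey dfs.datanode.http.address)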

 

Thanks!

 

Guido.

Contributor

The weird thing here is that the DataNode service is not logging anything.

The log file in /var/log/hadoop-hdfs/ has no text; only the .out file has content, and it always appears to be fine. The service is running, but the node is not present in the cluster and nothing is being logged.

Is there a way to debug the DataNode?
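
(One way to get more output when the regular log file stays empty, assuming a standard CDH package layout, is to run the DataNode in the foreground with the root logger forced to DEBUG on the console:

# run as the hdfs user on the new node; overrides log4j.properties for this run only
export HADOOP_ROOT_LOGGER=DEBUG,console
hdfs datanode

Output landing only in the .out file would also suggest the process is writing to stdout/stderr rather than the appender configured in log4j.properties, so that file is worth comparing against a healthy node.)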

 

Thanks!

Champion

Did you set your DataNode secure port to 1004?

 

Since I don't know the exact configuration of your system, I assume that enabling the properties below should fix your problem, based on the error you have shown.

 

hadoop.rpc.protection = privacy (core-site.xml)
dfs.encrypt.data.transfer = true (hdfs-site.xml)
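
(In raw XML form, the shorthand above would look roughly like this; the values are the suggestion from this reply, not settings verified against this cluster:

<!-- core-site.xml -->
<property>
  <name>hadoop.rpc.protection</name>
  <value>privacy</value>
</property>

<!-- hdfs-site.xml -->
<property>
  <name>dfs.encrypt.data.transfer</name>
  <value>true</value>
</property>)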

 

Restart all the daemons for the changes to take effect.

Contributor

Thanks @csguna!

 

I ran the same command (hdfs datanode) on other nodes that are up and running, and the error is the same.

The strange thing here is that my service is not logging; I mean hadoop-hdfs-datanode-mynode.log is empty, blank, nothing...

The YARN NodeManager is up and running on the same node and working fine; it picked up tasks immediately.

If I can't get logs from the DataNode service (which is running but silent), I won't be able to do anything.
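
(A possible explanation for seeing the identical error on healthy nodes, offered as an assumption based on how the Hadoop 2 launcher scripts work rather than anything confirmed in this thread: a bare "hdfs datanode" from a shell skips the secure jsvc startup path, so the privileged ports are never handed to the process and checkSecureConfig fails even where the packaged service runs fine. The service start path looks approximately like:

# what the packaged service does, roughly (paths are CDH-typical and may vary)
export HADOOP_SECURE_DN_USER=hdfs      # usually set in /etc/default/hadoop-hdfs-datanode
export JSVC_HOME=/usr/lib/bigtop-utils # where CDH ships jsvc
service hadoop-hdfs-datanode start     # starts the DataNode as root via jsvc)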

Any help is very welcome!

 

Thanks.

 

Guido.

Champion

Did you try bouncing the cluster after those changes in hdfs-site.xml and core-site.xml?

Please let me know.

Contributor

@csguna I'll try to do it.

I realized the cluster is full of custom Java classes and dependencies, so I have to take a deep dive into the config in order to find out what is happening.

Also, the cluster is pure CDH with no Cloudera Manager, so any issue is a little more complex to solve.

As soon as I solve this, I'll let you know.

Thanks!

Contributor

Finally the node is up and running!

There were lots of custom Java packages that I had forgotten to include on the new DN.

It's still not clear to me how missing JAR files can cause such strange behavior.
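
(For anyone else landing here: one way to spot missing JARs like this is to compare the effective classpath of a healthy DataNode host with the new one:

# run on a healthy node and on the new node, then diff the two files
hadoop classpath | tr ':' '\n' | sort > /tmp/classpath.$(hostname).txt

On Hadoop 2.6 and later, "hadoop classpath --glob" expands the wildcard entries so the individual JARs can be compared directly.)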

Thanks for your help!

 

Guido.

New Contributor

How did you resolve the issue? I am facing a similar problem with the DataNode not starting.