
Datanode denied communication with namenode

Contributor

We are seeing the error below when trying to restart a DataNode. While troubleshooting, we found that the IP address specified in /etc/hosts is not being used; instead, the DataNode is using another IP address associated with the FQDN.

hostname -I lists 3 IP addresses (not sure why).

2016-03-21 18:12:37,325 ERROR datanode.DataNode (BPServiceActor.java:run(828)) - Initialization failed for Block pool BP-2035900203-135.25.27.90-1453931345036 (Datanode Uuid 9587a317-6299-4dcf-87e7-10a708e242f2) service to xxx.domain.com/xx.xx.xx.xx:8020 Datanode denied communication with namenode because hostname cannot be resolved (ip=xx.xx.xx.xx, hostname=xx.xx,xx.xx): DatanodeRegistration(0.0.0.0:50010, datanodeUuid=9587a317-6299-4dcf-87e7-10a708e242f2, infoPort=50075, infoSecurePort=0, ipcPort=8010, storageInfo=lv=-56;cid=CID-2bc3fbce-df3a-4ad8-9303-aedae5392f82;nsid=2086512923;c=0)

at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.registerDatanode(DatanodeManager.java:873)

at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.registerDatanode(FSNamesystem.java:4531)

at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.registerDatanode(NameNodeRpcServer.java:1286)

at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.registerDatanode(DatanodeProtocolServerSideTranslatorPB.java:95)

at org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:28849)
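One way to see the mismatch described in the question (the address in /etc/hosts versus the address actually associated with the FQDN) is to print both side by side. The following is a minimal sketch, not part of the original post: the function names are made up for illustration, and it assumes the system resolver is reachable through Python's standard `socket` module. Depending on the name service order configured on the host, the resolver may itself consult /etc/hosts first.

```python
import socket


def hosts_file_addresses(hosts_text, name):
    """Return the addresses a hosts-file text maps to `name`, in file order."""
    addresses = []
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        address, names = fields[0], fields[1:]
        # Host names are case-insensitive, so compare in lower case.
        if name.lower() in (n.lower() for n in names):
            addresses.append(address)
    return addresses


def resolver_addresses(name):
    """Return the sorted, de-duplicated addresses the system resolver gives for `name`."""
    infos = socket.getaddrinfo(name, None)
    return sorted({info[4][0] for info in infos})


def main():
    fqdn = socket.getfqdn()
    print("FQDN:", fqdn)
    try:
        with open("/etc/hosts") as f:
            print("/etc/hosts:", hosts_file_addresses(f.read(), fqdn))
        print("resolver:  ", resolver_addresses(fqdn))
    except OSError as exc:  # missing file or failed lookup
        print("lookup failed:", exc)


if __name__ == "__main__":
    main()
```

If the two lists differ, or the resolver returns several addresses as `hostname -I` does here, that is a hint about which address the DataNode may be registering with.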

Re: Datanode denied communication with namenode

Mentor

@kjilla

That's a typical configuration issue. There are two schools of thought:

1. During the Ambari cluster setup, all participating hosts should use FQDNs, NOT IP addresses. The reason is obvious: imagine the mess if an IP address changes. DNS resolution is the best solution.

2. If the above isn't an option, then modify /etc/ambari-agent/conf/ambari-agent.ini. There is an Ambari workaround for this, described in an old document.

Re: Datanode denied communication with namenode

Contributor

Fixed the issue by updating /etc/hosts with the IP address on the same network and restarting.

Re: Datanode denied communication with namenode

Contributor
An update to this post for practitioners who land here to solve similar problems:
We added a new node to the cluster through the Ambari GUI (using the FQDN during install). We could see the node in the Hosts list in Ambari and in the ResourceManager UI, but it was not accepting containers.
We also could not see the new node in the Datanodes tab of the NameNode view (http://<Active Name Node Host FQDN>:50070/dfshealth.html#tab-overview).

We ran this:

hadoop dfsadmin -refreshNodes

Followed by this:

hadoop dfsadmin -report

But we could not see the new node in the report.

We looked at the Datanode error log:

2017-09-01 15:25:59,859 INFO  datanode.DataNode (BPServiceActor.java:register(768)) - Block pool BP-1930018148-58.162.144.211-1462411884867 (Datanode Uuid 734af6ba-e5a0-4f6e-81d4-7ad089f6d685) service to dh02.int.belong.com.au/XX.XXX.XXX.XXX:8020 beginning handshake with NN
2017-09-01 15:25:59,864 ERROR datanode.DataNode (BPServiceActor.java:run(828)) - Initialization failed for Block pool BP-1930018148-58.162.144.211-1462411884867 (Datanode Uuid 734af6ba-e5a0-4f6e-81d4-7ad089f6d685) service to dh02.int.belong.com.au/XX.XXX.XXX.XXX:8020 Datanode denied communication with namenode because hostname cannot be resolved (ip=YY.YYY.YYY.YY, hostname=YY.YYY.YYY.YY): DatanodeRegistration(0.0.0.0:50010, datanodeUuid=734af6ba-e5a0-4f6e-81d4-7ad089f6d685, infoPort=50075, infoSecurePort=0, ipcPort=8010, storageInfo=lv=-56;cid=CID-0019b609-89c6-421f-b98b-21607b8a21c6;nsid=1515412344;c=0)
at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.registerDatanode(DatanodeManager.java:883)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.registerDatanode(FSNamesystem.java:4541)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.registerDatanode(NameNodeRpcServer.java:1350)
at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.registerDatanode(DatanodeProtocolServerSideTranslatorPB.java:96)
at org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:29062)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2151)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2147)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2145)

and noticed this message:

Datanode denied communication with namenode because hostname cannot be resolved

On the new node, run the following to verify the FQDN it reports:

hostname -f  

We added the following line to the /etc/hosts file of the Active and Secondary NameNodes:

YY.YYY.YYY.YY dh10.int.belong.com.au dh10

We then restarted the DataNode and NodeManager on the new node. On refreshing the NameNode dashboard, I could see the new node listed. (Note: we did not have entries for the other DataNodes in the /etc/hosts file, so we realized that this was only a workaround for the problem.)

I discussed this issue with our Server Admin to check the DNS resolution of the new box, as our cluster uses a DNS resolver rather than /etc/hosts for hostname resolution. The check did not reveal any problem. We then analysed the DataNode log again and saw that the error reported the new node's hostname as an IP address:
(ip=YY.YYY.YYY.YY, hostname=YY.YYY.YYY.YY)
This pointed to a reverse DNS lookup problem. The Server Admin cross-checked the DNS server and found that reverse DNS was not set up for the new DataNode. Once the settings were updated, I removed the entry from the /etc/hosts files on the Active and Secondary NameNodes and restarted the DataNode and NodeManager on the new node. It has all worked fine since then.
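The diagnosis above can be sketched in a few lines of Python. This is an illustrative sketch only, not something from the original thread: the function names are hypothetical, and it assumes the log line has the `(ip=..., hostname=...)` shape shown in this post. It pulls the two values out of the error line, flags the case where the hostname is just the IP again, and then tries a reverse lookup with the standard `socket.gethostbyaddr` call.

```python
import re
import socket

# Matches the "(ip=..., hostname=...)" fragment of the error line.
REGISTRATION = re.compile(r"\(ip=(?P<ip>[^,\s]+),\s*hostname=(?P<hostname>[^)\s]+)\)")


def parse_denied_registration(log_line):
    """Return (ip, hostname) from a 'Datanode denied communication' line, or None."""
    match = REGISTRATION.search(log_line)
    if match is None:
        return None
    return match.group("ip"), match.group("hostname")


def hostname_is_just_ip(ip, hostname):
    """True when the logged hostname merely repeats the IP (no name was found)."""
    return ip == hostname


def reverse_name(ip, lookup=socket.gethostbyaddr):
    """Return the name a reverse lookup gives for `ip`, or None if it fails.

    `lookup` is injectable so the function can be exercised without a network.
    """
    try:
        name, _aliases, _addresses = lookup(ip)
    except OSError:  # socket.herror and socket.gaierror are both OSError subclasses
        return None
    return name
```

Running `reverse_name` on the NameNode host against the DataNode's IP is the interesting case, since that is where the registration is rejected. Note that the lookup may be answered from /etc/hosts if the host is configured to check it first, so a workaround entry like the one above can mask a missing reverse DNS record.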