12-10-2018 08:37 AM - last edited on 12-10-2018 12:22 PM by cjervis
I have installed HDFS on a 12-node cluster using Cloudera Manager. It is deployed on EC2 (AWS) instances. Each EC2 instance has two network interfaces, eth0 and eth1. eth1 has a static IP, while eth0 has an IP that changes whenever the instance is rebooted. Let's say the eth0 IP is 'ABC' and the eth1 IP is 'XYZ'. In my hosts file (/etc/hosts) I have added an entry for every node, mapping its FQDN to its eth1 IP. For some reason, when the DataNodes try to connect to the NameNode, they use the eth0 IP ('ABC' in this case). The DataNode then fails with the error below.
Initialization failed for Block pool BP-1423100917-*name_node_host*-1544213589860 (Datanode Uuid 160d6133-54f1-4a29-a6f0-0e52c0c59708) service to *NAME_NODE_HOSTNAME*.net/*NAME_NODE_IP*:8022 Datanode denied communication with namenode because hostname cannot be resolved (ip=ABC, hostname=ABC): DatanodeRegistration(XYZ, datanodeUuid=160d6133-54f1-4a29-a6f0-0e52c0c59708, infoPort=50075, infoSecurePort=0, ipcPort=50020, storageInfo=lv=-56;cid=cluster8;nsid=2080909946;c=0)
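The "(ip=ABC, hostname=ABC)" part of the message suggests that the NameNode's reverse lookup of the connecting address returned the address itself, meaning no name was found for the eth0 IP. Below is a minimal Python sketch for checking this from the NameNode host. The function name `reverse_lookup` and its `resolver` parameter are illustrative and not part of Hadoop, and the sketch only approximates the check the NameNode performs.

```python
import socket


def reverse_lookup(ip, resolver=socket.gethostbyaddr):
    """Return the hostname the local resolver maps `ip` to, or None.

    `resolver` is an illustrative hook so the function can be exercised
    without real DNS; by default it uses the system resolver, which
    consults /etc/hosts and DNS according to the host's configuration.
    """
    try:
        hostname, _aliases, _addresses = resolver(ip)
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError.
        return None
    # A "name" identical to the address means nothing was really resolved.
    return None if hostname == ip else hostname


if __name__ == "__main__":
    import sys

    # Pass one or more IP addresses on the command line.
    for address in sys.argv[1:]:
        name = reverse_lookup(address)
        print(f"{address} -> {name if name else 'NOT RESOLVABLE'}")
```

Running it on the NameNode host with a DataNode's eth0 and eth1 addresses (in place of 'ABC' and 'XYZ') should show whether only the eth1 address maps back to a name there.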
I have tried the options below to fix this issue, but neither worked:
Setting the property dfs.datanode.dns.interface to 'eth1' (hdfs-site.xml) on both the DataNodes and the NameNode, then restarting the HDFS service from the Cloudera Manager UI. I also tried setting it only on the DataNodes or only on the NameNode.
Setting the property dfs.namenode.datanode.registration.ip-hostname-check to 'false' (hdfs-site.xml) on both the DataNodes and the NameNode, then restarting the HDFS service from the Cloudera Manager UI. I also tried setting it only on the DataNodes or only on the NameNode.
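Since both properties were set through Cloudera Manager, it may be worth confirming that they actually reached the hdfs-site.xml the running role reads. Below is a minimal Python sketch, assuming the standard Hadoop configuration layout of `<configuration>` containing `<property>` elements with `<name>` and `<value>` children. The sample XML and the function name `read_properties` are illustrative, and the path of the deployed file is left for you to supply.

```python
import xml.etree.ElementTree as ET

# The two properties mentioned above.
WANTED = (
    "dfs.datanode.dns.interface",
    "dfs.namenode.datanode.registration.ip-hostname-check",
)


def read_properties(xml_text, names=WANTED):
    """Map each requested property name to its value, or None if absent."""
    root = ET.fromstring(xml_text)
    found = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        if name in names:
            found[name] = (prop.findtext("value") or "").strip()
    return {name: found.get(name) for name in names}


# Illustrative sample, not taken from a real cluster.
SAMPLE = """<configuration>
  <property>
    <name>dfs.datanode.dns.interface</name>
    <value>eth1</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    import sys

    # Read the file given on the command line, or fall back to the sample.
    if len(sys.argv) > 1:
        with open(sys.argv[1], "rb") as handle:
            text = handle.read()
    else:
        text = SAMPLE
    for key, value in read_properties(text).items():
        print(f"{key} = {value if value is not None else '(not set)'}")
```

A property reported as "(not set)" in the file a role actually uses would mean the change never took effect for that role, regardless of what the UI shows.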
Most previous posts about this error point to the parameters mentioned above, but they did not work for me. Has anyone faced the same issue?
Version - 5.15.1