Created 02-15-2018 08:11 AM
Dear All,
I did a fresh installation of HDP 2.6.2.0 with one name node and two data nodes. The History Server service fails to start with the error below. Both DataNode services are started and running, yet I am also unable to copy any files to HDFS: the datanodes are not detected, and no registration information reaches the namenode. It looks like a network issue between the name node and the data nodes.
"
{
"RemoteException": {
"exception": "IOException",
"javaClassName": "java.io.IOException",
"message": "Failed to find datanode, suggest to check cluster health. excludeDatanodes=null"
}
}
"
$ hadoop fs -copyFromLocal /tmp/test/ambari.repo /test/
18/02/15 15:12:37 WARN hdfs.DFSClient: DataStreamer Exception
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /test/ambari.repo._COPYING_ could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s) are excluded in this operation.
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1709)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getNewBlockTargets(FSNamesystem.java:3337)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3261)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:850)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:504)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
$ hdfs dfsadmin -report
Configured Capacity: 0 (0 B)
Present Capacity: 0 (0 B)
DFS Remaining: 0 (0 B)
DFS Used: 0 (0 B)
DFS Used%: NaN%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0
-------------------------------------------------
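The report above shows zero configured capacity, which means the namenode sees no registered datanodes at all. For reference, this is roughly how I confirmed the DataNode process is at least up and listening locally on each data node (the port numbers are the HDP 2.6 defaults; adjust if yours are customized):
$ jps | grep -i datanode
$ ss -ltn | grep -E ':(50010|50020|50075)'
The process is running and the ports are listening locally, so the problem is most likely on the network path between the nodes.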
Solutions I have tried:
1. Verified the /etc/hosts file and did DNS lookups from the edge, name, and data nodes to all other nodes; everything resolves properly.
2. Added the entries below to hdfs-site.xml and restarted the services (see the connectivity checks sketched after this list).
dfs.client.use.datanode.hostname=true
dfs.datanode.use.datanode.hostname=true
dfs.namenode.datanode.registration.ip-hostname-check=false
3. Port 50010 is open on the data nodes.
4. Port 50070 is open on the name node.
5. Did a clean reboot of all nodes and services.
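To rule out the network path itself, checks along these lines can be run in both directions (the hostnames are placeholders; 8020 is the default namenode RPC port and 50010 the default datanode data-transfer port in HDP 2.6):
$ # From a data node: can the namenode RPC port be reached?
$ nc -vz namenode.example.com 8020
$ # From the name or edge node: can the datanode transfer port be reached?
$ nc -vz datanode1.example.com 50010
$ # Confirm the hdfs-site.xml overrides actually took effect:
$ hdfs getconf -confKey dfs.client.use.datanode.hostname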
Still, the issue remains the same. The Hortonworks links give just the port numbers. I want to know exactly which ports should be opened on the name and data nodes, and which nodes need to access each of them. This environment is on AWS, so I need to specify the source hosts that access each port when setting up the security group rules.
Appreciate your help. Thank you.
Created 02-17-2018 06:09 AM
Please open the ports listed here: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.0/bk_reference/content/hdfs-ports.html
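To compare what is currently open against that list, you can dump the existing rules of a node's security group with the AWS CLI (the group ID below is a placeholder):
$ aws ec2 describe-security-groups \
      --group-ids sg-0123456789abcdef0 \
      --query 'SecurityGroups[].IpPermissions'
Any HDFS port from the documentation that is missing there needs an ingress rule allowing the other cluster nodes to reach it.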
Created 02-17-2018 06:20 AM
The link does not say between which nodes each port needs to be open, e.g., between name and data nodes or between edge and name nodes. For the above error, I'm wondering what is getting missed. Anything else I can try? Thank you.
Created 02-21-2018 07:52 AM
Hi All,
I was able to fix the issue: you need to open ports 0-65535 in the AWS security group so the nodes can communicate with each other. This solved my problem. Thanks.
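For anyone else hitting this, a sketch of the rule I mean with the AWS CLI (the security group ID is a placeholder, and this assumes all cluster nodes sit in the same group):
$ aws ec2 authorize-security-group-ingress \
      --group-id sg-0123456789abcdef0 \
      --protocol tcp \
      --port 0-65535 \
      --source-group sg-0123456789abcdef0
Because the source is the security group itself, this opens all TCP ports only between the cluster nodes, not to the internet.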