Created 05-08-2016 02:15 PM
We are using HDP 2.3, Ambari 2.1.2
We have connected 3 datanodes to the cluster (slave1, slave2, slave3)
Ambari server is running on a distinct server.
NameNode is running on slave1.
Dashboard shows:
DataNodes Live 1/3 (0 dead, 0 decommissioning)
NodeManagers Live 3/3
I can access all three DataNodes from the Ambari "Hosts" page, and I can restart the services on them.
The NameNode UI shows 1 live DataNode (slave1, the same host that runs the NameNode).
Created 05-08-2016 02:36 PM
Can you please run the commands below as the hdfs user and share the output?
#Command 1:
hadoop dfsadmin -report
#Command 2:
hadoop dfsadmin -refreshNodes
After running the commands above, please check the NameNode UI to see whether it shows 3/3 live nodes. If possible, please share a screenshot of the NN UI.
Created on 05-08-2016 05:45 PM - edited 08-19-2019 01:48 AM
1. report:
Configured Capacity: 51636727808 (48.09 GB)
Present Capacity: 28807827456 (26.83 GB)
DFS Remaining: 28329484288 (26.38 GB)
DFS Used: 478343168 (456.18 MB)
DFS Used%: 1.66%
Under replicated blocks: 1023
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0
-------------------------------------------------
Live datanodes (1):
Name: 192.168.41.10:50010 (stream-srv01.streaming.rnd.qa)
Hostname: stream-srv01.streaming.rnd.qa
Decommission Status : Normal
Configured Capacity: 51636727808 (48.09 GB)
DFS Used: 478343168 (456.18 MB)
Non DFS Used: 22828900352 (21.26 GB)
DFS Remaining: 28329484288 (26.38 GB)
DFS Used%: 0.93%
DFS Remaining%: 54.86%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 2
Last contact: Sun May 08 20:43:40 IDT 2016
2. refresh command output: Refresh nodes successful
3. The NameNode UI shows 1 live DataNode (the one on the same host as the NameNode).
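For anyone comparing the report against the hosts they expect, here is a minimal Python sketch that pulls the live DataNode count and hostnames out of saved `dfsadmin -report` text. It assumes the report layout shown in item 1 above; the function names and the expected-host list are hypothetical, not part of any Hadoop tooling.

```python
import re

def live_datanodes(report):
    """Return (count, hostnames) for the 'Live datanodes' section of the report text."""
    match = re.search(r"Live datanodes \((\d+)\)", report)
    if match is None:
        return 0, []
    # Only look at text between the live header and any "Dead datanodes" section.
    live_section = report[match.end():].split("Dead datanodes", 1)[0]
    return int(match.group(1)), re.findall(r"Hostname:\s*(\S+)", live_section)

def missing_datanodes(report, expected_hosts):
    """Return the expected hostnames (caller-supplied) that are not listed as live."""
    _count, live = live_datanodes(report)
    return sorted(set(expected_hosts) - set(live))
```

Calling `missing_datanodes(report_text, [...])` with the hostnames you expect returns the nodes the NameNode does not currently see as live.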
Created 05-08-2016 06:05 PM
Is the DataNode service on each slave node shown as started in Ambari? Also, on one of the slave nodes that is not working, check the /var/log/hadoop/hdfs/<datanodelog> file for errors.
Also, on slave1, check the NameNode log file to see whether the DataNodes are trying to send heartbeats to the NameNode.
Regards
Pranay Vyas
Created 05-08-2016 06:53 PM
It might be that the DataNode process is hung and is not responding to status updates.
Did you try restarting the DataNode service?
Can you share the NameNode and DataNode logs?
Created 05-09-2016 09:05 AM
Thank you!
In the DataNode log we found the following error:
datanode.DataNode (BPServiceActor.java:run(828)) - Initialization failed for Block pool BP-1743137494-192.168.41.10-1459773600716 (Datanode Uuid 41b4525d-b168-496d-a985-a2c2b5e889c1) service to slave1/192.168.41.10:8020 Datanode denied communication with namenode because hostname cannot be resolved (ip=192.168.41.11, hostname=192.168.41.11): DatanodeRegistration(0.0.0.0:50010, datanodeUuid=41b4525d-b168-496d-a985-a2c2b5e889c1, infoPort=50075, infoSecurePort=0, ipcPort=8010, storageInfo=lv=-56;cid=CID-655f2b29-eb31-4c17-81b6-b507e7f6cb5f;nsid=1493722973;c=0)
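The problem was reverse DNS resolution, a check performed by the NameNode: it could not resolve the DataNode's IP back to a hostname (note hostname=192.168.41.11 in the error above), so it refused the registration.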
This article was very helpful: why-datanode-is-denied-communication-with-namenode
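To reproduce the check from the NameNode host, a small Python sketch along these lines can help. It uses the standard `socket.gethostbyaddr` call, which goes through the system resolver (so /etc/hosts entries count as well as DNS); the function name and the injectable `resolver` parameter are my own additions for illustration, and this is not the NameNode's actual code path.

```python
import socket

def check_reverse_dns(ip, resolver=socket.gethostbyaddr):
    """Return the hostname that `ip` reverse-resolves to, or None if there is none."""
    try:
        hostname, _aliases, _addresses = resolver(ip)
    except OSError:
        # socket.herror and socket.gaierror are both OSError subclasses.
        return None
    # A "hostname" equal to the IP means no real name was found,
    # which matches the hostname=192.168.41.11 seen in the log.
    return None if hostname == ip else hostname
```

Running `check_reverse_dns("192.168.41.11")` on the NameNode host and getting `None` back would point to the same problem as in the log. As far as I know, the check itself is controlled by the `dfs.namenode.datanode.registration.ip-hostname-check` property in hdfs-site.xml (default true), but fixing DNS or /etc/hosts is the cleaner route.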
Created 12-07-2016 02:50 PM
I deployed a cluster on VMware, and because I was having issues adding an additional node, I cloned a VM, changed its hostname, brought it up, and added the DataNode manually. As I was seeing just one DataNode on the dashboard, I tried making all the changes suggested here, which didn't work. As soon as I shut down the active DataNode, the second one started to show up. Any clues as to what was happening in the back end?
Created 12-08-2016 12:19 PM
My assumption was correct: the DataNodes (probably every node) had the same UUID, hence this issue. I removed the installed software, directories, and files, then re-registered the node, and after that it worked fine.
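A way to confirm this before reinstalling: each DataNode keeps its UUID as a `datanodeUuid=` line in the `current/VERSION` file under every directory listed in `dfs.datanode.data.dir`, and a cloned VM carries that file over unchanged. The Python sketch below compares the VERSION text collected from several hosts; how you gather the files, plus the function names, are assumptions for illustration.

```python
from collections import defaultdict

def parse_datanode_uuid(version_text):
    """Extract the datanodeUuid value from the text of a DataNode VERSION file."""
    for line in version_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "datanodeUuid":
            return value.strip()
    return None

def find_duplicate_uuids(version_by_host):
    """Map each UUID used by more than one host to the sorted list of those hosts."""
    hosts_by_uuid = defaultdict(list)
    for host, text in version_by_host.items():
        uuid = parse_datanode_uuid(text)
        if uuid is not None:
            hosts_by_uuid[uuid].append(host)
    return {u: sorted(h) for u, h in hosts_by_uuid.items() if len(h) > 1}
```

If `find_duplicate_uuids` returns a non-empty result, the NameNode sees those hosts as one DataNode, which would explain why the second node only appeared once the first was shut down.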