Datanode uuid unassigned
Labels: Apache Hadoop
Created 05-22-2018 09:54 PM
I have a 4-node cluster (2 masters & 2 DataNodes) - a fresh installation.
One of the DataNodes is not coming up:
2018-05-22 14:37:56,024 ERROR datanode.DataNode (BPServiceActor.java:run(780)) - Initialization failed for Block pool <registering> (DatanodeUuid unassigned) service to Host1.infosolco.net/10.215.78.41:8020. Exiting. java.io.IOException: All specified directories are failed to load.
When I look at the VERSION files on the two DataNodes, I see:
root@Datanode02:/spark/hdfs/data/current # cat VERSION
#Tue May 22 14:00:02 PDT 2018
storageID=DS-0009b75a-e67a-4623-b7a2-12bf395c1d61
clusterID=CID-eb6df30f-7f16-4f94-826c-c7640e1e45a2
cTime=0
datanodeUuid=f005656a-673e-4c97-b25a-e19f04e1ec94
storageType=DATA_NODE
layoutVersion=-56
__________________
root@Datanode01:/spark/hdfs/data/current # cat VERSION
#Tue May 22 14:00:02 PDT 2018
storageID=DS-0009b75a-e67a-4623-b7a2-12bf395c1d61
clusterID=CID-eb6df30f-7f16-4f94-826c-c7640e1e45a2
cTime=0
datanodeUuid=f005656a-673e-4c97-b25a-e19f04e1ec94
storageType=DATA_NODE
layoutVersion=-56
I see that both DataNodes have the same UUID, and the 2nd DataNode is not coming up.
Please suggest!
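For reference, one quick way to confirm the duplicate UUID on both DataNodes is a loop like the one below (a sketch only; the hostnames and the /spark/hdfs/data path are taken from the VERSION output above and may differ in other environments):
$ # assumes passwordless ssh as root to both DataNodes; path taken from the VERSION output above
$ for host in Datanode01 Datanode02; do echo -n "$host: "; ssh root@$host grep datanodeUuid= /spark/hdfs/data/current/VERSION; done
Datanode01: datanodeUuid=f005656a-673e-4c97-b25a-e19f04e1ec94
Datanode02: datanodeUuid=f005656a-673e-4c97-b25a-e19f04e1ec94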
Created 05-23-2018 05:49 AM
Try to perform the following steps on the failed DataNode:
- Get the list of DataNode directories from /etc/hadoop/conf/hdfs-site.xml using the following command:
$ grep -A1 dfs.datanode.data.dir /etc/hadoop/conf/hdfs-site.xml
<name>dfs.datanode.data.dir</name>
<value>/data0/hadoop/hdfs/data,/data1/hadoop/hdfs/data,/data2/hadoop/hdfs/data,/data3/hadoop/hdfs/data,/data4/hadoop/hdfs/data,/data5/hadoop/hdfs/data,/data6/hadoop/hdfs/data,/data7/hadoop/hdfs/data,/data8/hadoop/hdfs/data,/data9/hadoop/hdfs/data</value>
- Get datanodeUuid by grepping the DataNode log:
$ grep "datanodeUuid=" /var/log/hadoop/hdfs/hadoop-hdfs-datanode-$(hostname).log | head -n 1 | perl -ne '/datanodeUuid=(.*?),/ && print "$1\n"' 1dacef53-aee2-4906-a9ca-4a6629f21347
- Copy over a VERSION file from one of the <dfs.datanode.data.dir>/current/ directories of a healthy running DataNode:
$ scp <healthy datanode host>:<dfs.datanode.data.dir>/current/VERSION ./
- Modify the datanodeUuid in the VERSION file with the datanodeUuid from the above grep search:
$ sed -i.bak -E 's|(datanodeUuid)=(.*$)|\1=1dacef53-aee2-4906-a9ca-4a6629f21347|' VERSION
- Blank out the storageID= property in the VERSION file:
$ sed -i.bak -E 's|(storageID)=(.*$)|\1=|' VERSION
- Copy this modified VERSION file to the current/ path of every directory listed in the dfs.datanode.data.dir property of hdfs-site.xml:
$ for i in {0..9}; do cp VERSION /data$i/hadoop/hdfs/data/current/; done
- Change ownership of this VERSION file to hdfs:hdfs and its permissions to 644:
$ for i in {0..9}; do chown hdfs:hdfs /data$i/hadoop/hdfs/data/current/VERSION; done
$ for i in {0..9}; do chmod 644 /data$i/hadoop/hdfs/data/current/VERSION; done
- One more level down, there is a different VERSION file located under the Block Pool current folder at:
/data0/hadoop/hdfs/data/current/BP-*/current/VERSION
This file does not need to be modified -- just place it in the appropriate directories.
- Copy over this particular VERSION file from a healthy DataNode into the current/BP-*/current/ folder for each directory listed in dfs.datanode.data.dir of hdfs-site.xml:
$ scp <healthy datanode host>:<dfs.datanode.data.dir>/current/BP-*/current/VERSION ./VERSION2
$ for i in {0..9}; do cp VERSION2 /data$i/hadoop/hdfs/data/current/BP-*/current/VERSION; done
- Change ownership of this VERSION file to hdfs:hdfs and its permissions to 644:
$ for i in {0..9}; do chown hdfs:hdfs /data$i/hadoop/hdfs/data/current/BP-*/current/VERSION; done
$ for i in {0..9}; do chmod 644 /data$i/hadoop/hdfs/data/current/BP-*/current/VERSION; done
- Restart DataNode from Ambari.
- The VERSION file located at <dfs.datanode.data.dir>/current/VERSION will have its storageID repopulated with a regenerated ID.
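A quick way to verify this is to check that the storageID field is no longer blank after the restart (a sketch; /data0 here is just the first directory from the dfs.datanode.data.dir example above, so adjust the path to your own layout):
$ # any of the data directories will do; /data0 is an assumption based on the example config above
$ grep storageID= /data0/hadoop/hdfs/data/current/VERSION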
If losing the data on this node is not an issue (say, for example, the node was previously in a different cluster, or was out of service for an extended time), then
- delete all data and subdirectories under each dfs.datanode.data.dir location (keep the directory itself, though),
- restart the DataNode daemon or service.
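A minimal sketch of those two steps, assuming the same /data0../data9 layout from the example above (adjust to your own dfs.datanode.data.dir; note that this permanently removes every block replica stored on this node):
$ # wipe the contents of each data directory but keep the directory itself
$ for i in {0..9}; do rm -rf /data$i/hadoop/hdfs/data/*; done
Then restart the DataNode from Ambari as in the steps above.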
