HDFS "Datanode Uuid unassigned" error after a DataNode was down for 2 days and then recovered

Contributor

Hi,

The DataNode stops immediately after it starts. The DataNode log shows the error below during startup:

tailf /var/log/hadoop/hdfs/hadoop-hdfs-datanode-[hostname].log

2018-07-24 10:45:18,282 INFO  common.Storage (Storage.java:tryLock(776)) - Lock on /mnt/dn/sdl/datanode/in_use.lock acquired by nodename 55141@dat01.node
2018-07-24 10:45:18,283 WARN  common.Storage (DataStorage.java:loadDataStorage(449)) - Failed to add storage directory [DISK]file:/mnt/dn/sdl/datanode/
java.io.FileNotFoundException: /mnt/dn/sdl/datanode/current/VERSION (Permission denied)
        at java.io.RandomAccessFile.open0(Native Method)
        at java.io.RandomAccessFile.open(RandomAccessFile.java:316)
        at java.io.RandomAccessFile.<init>(RandomAccessFile.java:243)
        at org.apache.hadoop.hdfs.server.common.StorageInfo.readPropertiesFile(StorageInfo.java:245)
        at org.apache.hadoop.hdfs.server.common.StorageInfo.readProperties(StorageInfo.java:231)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:779)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.loadStorageDirectory(DataStorage.java:322)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.loadDataStorage(DataStorage.java:438)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.addStorageLocations(DataStorage.java:417)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:595)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1543)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1504)
        at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:319)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:272)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:768)
        at java.lang.Thread.run(Thread.java:748)
2018-07-24 10:45:18,283 ERROR datanode.DataNode (BPServiceActor.java:run(780)) - Initialization failed for Block pool <registering> (Datanode Uuid unassigned) service to nam02.node/192.168.19.3:8020. Exiting.
java.io.IOException: All specified directories are failed to load.
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:596)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1543)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1504)
        at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:319)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:272)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:768)
        at java.lang.Thread.run(Thread.java:748)
2 REPLIES

Rising Star

@Rambabu Chamakuri The log suggests that the permissions on the VERSION file are wrong:

java.io.FileNotFoundException: /mnt/dn/sdl/datanode/current/VERSION (Permission denied)

Check the permissions on the VERSION file:

ls -lh /mnt/dn/sdl/datanode/current/VERSION

The file should be owned by "hdfs:hdfs" with permissions set to 644. If it is not, change it accordingly:

chown hdfs:hdfs /mnt/dn/sdl/datanode/current/VERSION
chmod 644 /mnt/dn/sdl/datanode/current/VERSION

Then restart the DataNode.

Let me know if this helps solve your issue.
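
A "Permission denied" on open can also come from a directory above the file that the service user cannot traverse, so it may be worth looking at the whole path rather than only the file. The following is a minimal sketch, assuming a Linux host with GNU coreutils; the function name show_path_chain is hypothetical and not part of Hadoop.

```shell
#!/bin/sh
# Print mode, owner and group for every component of a path, starting at
# the file itself and walking up to "/".
# Assumption: GNU coreutils "stat" (the -c option is not portable to BSD).
show_path_chain() {
    path=$1
    while :; do
        stat -c '%A %U:%G %n' "$path" || return 1
        # Stop at the filesystem root, or at "." for a relative path.
        case $path in
            /|.) break ;;
        esac
        path=$(dirname "$path")
    done
}

# Example call with the path from the log above:
#   show_path_chain /mnt/dn/sdl/datanode/current/VERSION
```

Every directory in the output needs the execute (x) bit for the user that runs the DataNode, which this thread assumes is hdfs. Running `sudo -u hdfs test -r /mnt/dn/sdl/datanode/current/VERSION && echo readable` is another quick way to check access as that user.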

Contributor

@Pedro Andrade

Thanks for your reply. I checked the permissions and they were fine (the file is owned by "hdfs:hdfs" and the permissions are set to 644).

The node had been out of service for an extended time, so I followed the steps below:

  • Delete all data and subdirectories in dfs.datanode.data.dir (but keep the directory itself), or move the data aside instead, for example: $ mv /mnt/dn/sdl/datanode/current /mnt/dn/sdl/datanode/current.24072018
  • Restart the DataNode daemon or service.
  • Later, once the DataNode is confirmed healthy, delete the backup data: $ rm -rf /mnt/dn/sdl/datanode/current.24072018
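
The move-aside step above can be sketched as a small shell function. This is only an illustration under stated assumptions: the name move_current_aside is hypothetical, the DataNode is assumed to be stopped while it runs, and the restart itself is left out because it depends on how the cluster is managed.

```shell
#!/bin/sh
# Move a DataNode volume's "current" directory aside instead of deleting it.
# Usage: move_current_aside DATA_DIR [SUFFIX]
# Assumption: the DataNode process is not running while this executes.
move_current_aside() {
    data_dir=$1
    suffix=${2:-$(date +%d%m%Y)}   # day-month-year, as in the example above
    src=$data_dir/current
    dst=$data_dir/current.$suffix

    if [ ! -d "$src" ]; then
        echo "no such directory: $src" >&2
        return 1
    fi
    # Refuse to overwrite an earlier backup with the same name.
    if [ -e "$dst" ]; then
        echo "backup already exists: $dst" >&2
        return 1
    fi

    # Print the backup location so it can be removed later.
    mv "$src" "$dst" && echo "$dst"
}

# Example call with the path from this thread:
#   move_current_aside /mnt/dn/sdl/datanode 24072018
```

The block replicas in the moved directory are no longer served by this DataNode, so before running rm -rf on the backup it seems prudent to confirm with `hdfs fsck /` that the cluster reports no missing or under-replicated blocks.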

The DataNode is now up and live. Thanks to the Hortonworks community for the help and contributions.

Reference : https://community.hortonworks.com/questions/192751/databode-uuid-unassigned.html