Created 02-23-2016 09:12 AM
In my HDFS status summary, I see the following messages about missing and under-replicated blocks:
2,114 missing blocks in the cluster. 5,114,551 total blocks in the cluster. Percentage missing blocks: 0.04%. Critical threshold: any.
On executing the command hdfs fsck / -list-corruptfileblocks
I got the following output: The filesystem under path '/' has 2114 CORRUPT files
What is the best way to fix these corrupt files, and also to fix the under-replicated block problem?
Created 02-23-2016 09:19 AM
Hi Pranshu,
You can follow the instructions in the link below:
https://community.hortonworks.com/articles/4427/fix-under-replicated-blocks-in-hdfs-manually.html
Regards,
Karthik Gopal
Created 02-23-2016 10:03 AM
You can try to recover some missing blocks by making sure that all of your DataNodes, and all disks on them, are healthy and running. If they are and you still have missing blocks, the only way out is to delete the files with missing blocks, either one by one or all of them at once using the "hdfs fsck <path> -delete" command.
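For example, first list the affected files and then delete them (the file path below is a placeholder for one of the paths fsck reports):
hdfs fsck / -list-corruptfileblocks
hdfs fsck /path/to/corrupt/file -delete
Or, to delete every file with missing blocks under the root in one go:
hdfs fsck / -delete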
Regarding under-replicated blocks, HDFS is supposed to recover them automatically (by creating missing copies to fulfill the replication factor). If after a few days it doesn't, you can trigger the recovery by running the balancer or, as mentioned in another answer, by running the "setrep" command; see the sketch below.
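A sketch of both options (run the balancer as the HDFS superuser; the path and the replication factor of 3 are placeholders for your environment):
hdfs balancer
hadoop fs -setrep -w 3 /path/to/under-replicated/dir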
Created 02-23-2016 12:39 PM
@Pranshu Pranshu, you have 2 options. From another link:
"The next step would be to determine the importance of the file, can it just be removed and copied back into place, or is there sensitive data that needs to be regenerated?
If it's easy enough just to replace the file, that's the route I would take."
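If you take that route, here is a minimal sketch, assuming you still have a good copy of the file outside HDFS (all paths below are placeholders):
# identify the affected files
hdfs fsck / -list-corruptfileblocks
# remove the corrupt copy from HDFS
hdfs dfs -rm /path/to/corrupt/file
# copy the known-good replacement back into place
hdfs dfs -put /local/backup/of/file /path/to/corrupt/file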
Created 02-23-2016 02:46 PM
@Pranshu Pranshu, if the original question is answered, then please accept the best answer.
Created 02-24-2016 09:26 AM
It seems the replication factor is 1 in my case. How can I get the data recovered from the DR cluster?
Created 02-24-2016 09:58 AM
@Pranshu Pranshu, you can use the "setrep" command to set the replication factor for files and directories:
Usage: hadoop fs -setrep [-R] [-w] <numReplicas> <path>
Changes the replication factor of a file. If path is a directory then the command recursively changes the replication factor of all files under the directory tree rooted at path.
Options:
The -w flag requests that the command wait for the replication to complete. This can potentially take a very long time.
The -R flag is accepted for backwards compatibility. It has no effect.
Example:
To set the replication of an individual file to 3, you can use the command below:
hdfs dfs -setrep -w 3 /path/to/file
Setting replication on a directory applies recursively to all files under it. To change the replication of the entire HDFS to 3, you can use the command below:
hdfs dfs -setrep -R -w 3 /
Exit Code:
Returns 0 on success and -1 on error.
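To verify the result afterwards, you can check what fsck reports for the file (the path is a placeholder):
hdfs fsck /path/to/file -files -blocks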
Hope this helps you solve the problem.
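As for recovering files from the DR cluster: if healthy copies exist there, DistCp can copy them back. A sketch, where the NameNode addresses and the path are placeholders for your environment:
hadoop distcp hdfs://dr-namenode:8020/path/to/data hdfs://prod-namenode:8020/path/to/data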
Created 11-09-2017 09:51 PM
I have a similar problem, with the filesystem/NameNode in safemode because of under-replicated blocks. My problem is that "hdfs dfs -setrep -w 3 /path/to/file" fails because the filesystem is in safemode. If I am in safemode because of under-replicated blocks, and the command to fix that doesn't work while in safemode, what can I do?
I've tried the command to leave safemode and it seems to work, but it goes back into safemode within a VERY short time.
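For reference, the commands I'm using to check and leave safemode are the standard dfsadmin ones:
hdfs dfsadmin -safemode get
hdfs dfsadmin -safemode leave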