How to fix missing and under-replicated blocks?
Labels: Apache Hadoop
Created 02-23-2016 09:12 AM
In my HDFS status summary, I see the following messages about missing and under-replicated blocks:
2,114 missing blocks in the cluster. 5,114,551 total blocks in the cluster. Percentage missing blocks: 0.04%. Critical threshold: any.
On executing the command: hdfs fsck -list-corruptfileblocks
I got the following output: The filesystem under path '/' has 2114 CORRUPT files
What is the best way to fix these corrupt files and also fix the under-replicated block problem?
Created 02-23-2016 09:19 AM
Hi Pranshu,
You can follow the instructions in the link below:
https://community.hortonworks.com/articles/4427/fix-under-replicated-blocks-in-hdfs-manually.html
Regards,
Karthik Gopal
Created 02-23-2016 10:03 AM
You can try to recover some missing blocks by making sure that all your DataNodes, and all disks on them, are healthy and running. If they are and you still have missing blocks, the only way out is to delete the files with missing blocks, either one by one or all at once, using the "hdfs fsck <path> -delete" command.
Regarding under-replicated blocks, HDFS is supposed to recover them automatically by creating the missing copies to fulfill the replication factor. If it hasn't after a few days, you can trigger recovery by running the balancer or, as mentioned in another answer, by running the "setrep" command.
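A rough sketch of those commands (assuming a standard HDFS CLI; /data/example is a placeholder path, substitute whatever fsck actually reports):

# Confirm that all DataNodes, and the disks on them, are reporting in and healthy
hdfs dfsadmin -report

# List the files that still have corrupt/missing blocks
hdfs fsck / -list-corruptfileblocks

# If the data cannot be recovered, delete the affected file(s)
hdfs fsck /data/example -delete

# For under-replicated blocks, you can also kick off the balancer
hdfs balancer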
Created 02-23-2016 10:40 AM
- Get the full details of the files which are causing your problem using: hdfs fsck / -files -blocks -locations
- If it is not re-replicating on its own, run the balancer.
- If you are SURE these files are not needed and you just want to eliminate the error, you can run this command to automatically delete the corrupted files: hdfs fsck / -delete (see the sketch after this list).
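For example (a sketch only; /user/example/corrupt_file is a placeholder, not a real path on this cluster):

# 1. List every file that currently has corrupt blocks
hdfs fsck / -list-corruptfileblocks

# 2. Inspect one of the reported files before deciding what to do with it
hdfs fsck /user/example/corrupt_file -files -blocks -locations

# 3a. Remove just that one file...
hdfs dfs -rm /user/example/corrupt_file

# 3b. ...or, if you are sure none of the reported files are needed,
#     delete them all at once
hdfs fsck / -delete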
Created 02-23-2016 12:39 PM
@Pranshu Pranshu, you have 2 options... Another link:
"The next step would be to determine the importance of the file, can it just be removed and copied back into place, or is there sensitive data that needs to be regenerated?
If it's easy enough just to replace the file, that's the route I would take."
Created 02-23-2016 02:46 PM
@Pranshu Pranshu, If the original question is answered then please accept the best answer.
Created 02-24-2016 09:26 AM
It seems like the replication factor is 1 in my case. How can I recover the files from the DR cluster?
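If the same paths still exist on the DR cluster, one possible approach (a sketch only; dr-nn, prod-nn and /user/example/lost_file are placeholder names) is to drop the corrupt copy and pull the file back with distcp:

# Find the corrupt files on the production cluster
hdfs fsck / -list-corruptfileblocks

# Remove the damaged copy so the path can be replaced
hdfs dfs -rm /user/example/lost_file

# Copy the same file back from the DR cluster's NameNode
hadoop distcp hdfs://dr-nn:8020/user/example/lost_file hdfs://prod-nn:8020/user/example/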
Created 02-24-2016 09:58 AM
@Pranshu Pranshu, you can use the "setrep" command to set the replication factor for files and directories:
setrep
Usage: hadoop fs -setrep [-R] [-w] <numReplicas> <path>
Changes the replication factor of a file. If path is a directory then the command recursively changes the replication factor of all files under the directory tree rooted at path.
Options:
- The -w flag requests that the command wait for the replication to complete. This can potentially take a very long time.
- The -R flag is accepted for backwards compatibility. It has no effect.
Example:
To set the replication of an individual file to 3, you can use the command below:
hdfs dfs -setrep -w 3 /path/to/file
You can also do this recursively: for a directory, setrep is applied to every file under it. To change the replication of the entire HDFS to 3, you can use the command below:
hdfs dfs -setrep -w 3 /
Exit Code:
Returns 0 on success and -1 on error.
Hope this helps you solve the problem.
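As a rough sketch of how setrep is typically combined with fsck to find the affected files first (paths below are placeholders):

# fsck prints an "Under replicated" line for every affected block
hdfs fsck / | grep -i "under replicated"

# Raise the replication factor of a specific file back to 3;
# -w waits until re-replication finishes and can take a long time
hdfs dfs -setrep -w 3 /user/example/underreplicated_file

# Or apply it to a whole directory tree at once
hdfs dfs -setrep 3 /user/example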
Created 11-09-2017 09:51 PM
I have a similar problem with a filesystem/NameNode in safemode because of under-replicated blocks. My problem is that "hdfs dfs -setrep -w 3 /path/to/file" fails because the filesystem is in safemode. If I am in safemode because of under-replicated blocks, and the command to fix that doesn't work while in safemode, what can I do?
I've tried the command to leave safemode and it seems to work, but it goes back into safemode within a VERY short time.
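For context, these are the safemode commands in question (a sketch, assuming a recent HDFS release):

# Check whether the NameNode is currently in safemode
hdfs dfsadmin -safemode get

# Ask the NameNode to leave safemode
hdfs dfsadmin -safemode leave

# Check how many blocks are missing or under-replicated overall
hdfs dfsadmin -report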
