Created 08-09-2016 09:23 PM
What are the steps to remove corrupted blocks from HDFS?
Created 08-09-2016 09:43 PM
Perform the actions below as the hdfs user:
The output of hdfs fsck / will be very verbose, but it will mention which blocks are corrupt. We can do some grepping of the fsck output so that we aren't "reading through a firehose":
hdfs fsck / | egrep -v '^\.+' | grep -v replica | grep -v Replica
or, pointing at a specific NameNode (use the RPC port, typically 8020; 50070 is the web UI port and will not work for hdfs:// URIs):
hdfs fsck hdfs://ip.or.host:8020/ | egrep -v '^\.+' | grep -v replica | grep -v Replica
This will list the affected files without the long runs of dots, and it also filters out lines about files that merely have under-replicated blocks (which isn't necessarily an issue). The output should include something like this for each affected file:
/path/to/filename.fileextension: CORRUPT blockpool BP-1016133662-10.29.100.41-1415825958975 block blk_1073904305
/path/to/filename.fileextension: MISSING 1 blocks of total size 15620361 B
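As an alternative to grepping, fsck can also print just the corrupt files directly, which may be simpler (flag availability depends on your Hadoop version):
hdfs fsck / -list-corruptfileblocks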
The next step is to determine the importance of the file: can it simply be removed and copied back into place, or does it hold sensitive data that needs to be regenerated?
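To help decide, you can point fsck at the specific file to see its blocks, sizes, and which DataNodes hold the surviving replicas (the path here is the placeholder from the sample output above):
hdfs fsck /path/to/filename.fileextension -files -blocks -locations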
If it's easy enough just to replace the file, that's the route I would take.
Either of these commands will move the corrupted file to the trash:
hdfs dfs -rm /path/to/filename.fileextension
hdfs dfs -rm hdfs://ip.or.hostname.of.namenode:8020/path/to/filename.fileextension
Or you can skip the trash to permanently delete it (which is probably what you want to do):
hdfs dfs -rm -skipTrash /path/to/filename.fileextension
hdfs dfs -rm -skipTrash hdfs://ip.or.hostname.of.namenode:8020/path/to/filename.fileextension
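Once the corrupt copy is removed, you can copy a good replacement back into place, assuming you still have the source data outside the cluster (the local path here is illustrative):
hdfs dfs -put /local/copy/of/filename.fileextension /path/to/filename.fileextension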
As the hdfs user:
If you would rather not delete each file individually as above, running the command below will delete every file with corrupt or missing blocks in one pass.
hdfs fsck / -delete
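If outright deletion feels too aggressive, fsck can instead move the corrupted files into /lost+found on HDFS so they can be inspected first:
hdfs fsck / -move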