Created on 06-03-2018 03:36 PM - edited 08-17-2019 08:27 PM
Hello
I've noticed in Ambari under HDFS metrics that we have 2 blocks with corrupt replicas.
Running " hdfs fsck / " shows no corrupt blocks and system is healthy.
Running "hdfs dfsadmin-report" shows 2 corrupt replicas (same as Ambari dashboard)
I've restarted Ambari Metrics and the Ambari Agents on all nodes, plus the Ambari server, as noted in one of the threads I came across, but the problem remains.
Ambari is 2.5.2
Any ideas on how to fix this issue?
Thanks
Adi
Created 06-03-2018 06:41 PM
Below is the procedure to remove the corrupt blocks or files.
First, locate the files that have corrupt blocks:
$ hdfs fsck / | egrep -v '^\.+$'
or
$ hdfs fsck hdfs://ip.or.host:8020/ | egrep -v '^\.+$'
/path/to/filename.file_extension: CORRUPT blockpool BP-1016133662-10.29.100.41-1415825958975 block blk_1073904305
/path/to/filename.file_extension: MISSING 1 blocks of total size 15620361 B
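If you just want the list of affected files without grepping through the full output, fsck can also print them directly:
$ hdfs fsck / -list-corruptfileblocks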
Then remove the affected file (the -skipTrash form deletes it immediately instead of moving it to the trash):
$ hdfs dfs -rm /path/to/filename.file_extension
$ hdfs dfs -rm -skipTrash /path/to/filename.file_extension
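Alternatively, fsck itself can clean up corrupt files in one pass (use with care - -move relocates them to /lost+found, while -delete removes them outright):
$ hdfs fsck / -move
$ hdfs fsck / -delete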
To inspect the blocks and block locations of a particular file:
$ hdfs fsck /path/to/filename.file_extension -locations -blocks -files
$ hdfs fsck hdfs://ip.or.hostname.of.namenode:8020/path/to/filename.file_extension -locations -blocks -files
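For a healthy file, the output of that command looks roughly like this (an illustrative sketch - the generation stamp and datanode addresses are placeholders, and the exact format varies by Hadoop version):
/path/to/filename.file_extension 15620361 bytes, 1 block(s): OK
0. BP-1016133662-10.29.100.41-1415825958975:blk_1073904305_<genstamp> len=15620361 repl=3 [<datanode1>:50010, <datanode2>:50010, <datanode3>:50010]
Status: HEALTHY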
Created 06-04-2018 05:34 AM
Thank you @Geoffrey Shelton Okot
However, fsck shows no corrupt blocks; the problem is with corrupt replicas.
That said - the alert has since disappeared ...
Not sure if to be glad or suspicious 🙂
Adi
Created 06-02-2023 04:22 PM
The response above is NOT about fixing "files with corrupt replicas" but about finding and fixing files that are completely corrupt, i.e., files with no good replicas left from which to recover them.
The "files with corrupt replicas" warning means that a file has at least one corrupt replica, but the file can still be recovered from the remaining healthy replicas.
In this case
hdfs fsck /path ...
will not show these files because it considers them healthy.
These files and the corrupted replicas are only reported by the command
hdfs dfsadmin -report
and as far as I know there is no direct command to fix this. The only way I have found is to wait for the Hadoop cluster to heal itself by re-replicating the bad replicas from the good ones.
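If you would rather watch the self-healing progress than just wait, the NameNode exposes the same counter over its JMX endpoint (a minimal sketch - namenode-host is a placeholder, and 50070 assumes the default non-HTTPS web port; the counter is the CorruptBlocks attribute of the FSNamesystem bean):
$ curl -s 'http://namenode-host:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem' | grep CorruptBlocks
Once the NameNode has re-replicated the affected blocks and deleted the bad replicas, the value (and the Ambari widget) should drop back to 0.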