
Missing Block

A block is marked missing if none of its replicas has been reported to the NameNode.

Corrupt Block

A block is marked corrupt if all of its replicas are corrupted, or if none of them are reported to the NameNode.
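To see which files the NameNode currently considers affected, the read-only fsck options can be used. The following is a minimal sketch, assuming the hdfs client is on the PATH and is run as a user allowed to run fsck; the function names list_bad_blocks and show_block_locations are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper names; only the hdfs fsck options are real.

# List files that have missing/corrupt blocks under a path (default: /).
# This only reads NameNode metadata and changes nothing.
list_bad_blocks() {
  local path="${1:-/}"
  hdfs fsck "$path" -list-corruptfileblocks
}

# Show every block of one file and the DataNodes that hold each replica.
show_block_locations() {
  local file="$1"
  hdfs fsck "$file" -files -blocks -locations
}
```

For example, run list_bad_blocks /user first, then show_block_locations on one of the reported files to learn its block IDs and the DataNodes expected to hold them.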

Work through the following checklist before you conclude that a block is corrupt or missing.

  1. Check whether all DataNodes in the cluster are running.
  2. Check whether the NameNode reports any dead DataNodes.
  3. Check for disk failures on multiple DataNodes.
  4. Check whether disks have run out of space on multiple DataNodes.
  5. Check whether a block report was rejected by the NameNode (this shows up in the NameNode log as a warning or error).
  6. Check whether you changed any config groups.
  7. Check whether the block physically exists in the local filesystem or was removed by users unknowingly. Ex: "find <dfs.datanode.data.dir> -type f -iname '<blkid>*'". Repeat this step on all DataNodes.
  8. Check whether too many blocks are hosted on a single DataNode.
  9. Check whether the block report fails because it exceeds the maximum RPC size (64 MB by default). The NameNode log shows this as the warning "Protocol message was too large. May be malicious".
  10. Check whether a mount point was unmounted because of a filesystem failure.
  11. Check whether blocks were written to the root volume because a disk was automatically unmounted. That data may be hidden if you remount the filesystem on top of the existing DataNode directory.
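Some of the checks above can be scripted. The following is a rough sketch, not a complete diagnostic: the function names are hypothetical, the data directories must be passed in by hand (they can be read with "hdfs getconf -confKey dfs.datanode.data.dir", minus any file:// prefix), and the NameNode log path depends on your installation.

```shell
#!/usr/bin/env bash
# Hypothetical helper names; run find_block_files on each DataNode.

# Step 2: ask the NameNode which DataNodes it considers dead.
list_dead_datanodes() {
  hdfs dfsadmin -report -dead
}

# Step 7: search the given data directories for files of one block.
# Usage: find_block_files <block_id> <data_dir> [<data_dir> ...]
find_block_files() {
  local blkid="$1"; shift
  local dir
  for dir in "$@"; do
    if [ -d "$dir" ]; then
      # Prefix match: finds the block file and its .meta checksum file.
      find "$dir" -type f -iname "${blkid}*"
    else
      # A missing directory may point to an unmounted disk (steps 10-11).
      echo "WARN: $dir does not exist or is not a directory" >&2
    fi
  done
}

# Step 9: count NameNode log lines about oversized RPC messages.
# Usage: count_oversized_rpc <namenode_log_file>
count_oversized_rpc() {
  grep -c "Protocol message was too large" "$1"
}
```

For example: find_block_files blk_1073741825 /data1/hdfs/data /data2/hdfs/data, where the block ID and both paths are placeholders. Because the match is a prefix match, a short ID can also match longer block IDs that start with the same digits.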

Note: Running "hdfs fsck / -delete" deletes the affected files, so you will lose data. Make sure you have completed the whole checklist before running it.
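Before any destructive step, it may help to record which files are affected so they can be restored from another source later. A minimal sketch, assuming the hdfs client is on the PATH; the function name and the default output file name are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: save the list of affected files to a local text file.
# Usage: save_bad_file_list [<output_file>]
save_bad_file_list() {
  local out="${1:-corrupt_files_$(date +%Y%m%d_%H%M%S).txt}"
  # Read-only listing; nothing is deleted or moved by this command.
  hdfs fsck / -list-corruptfileblocks > "$out"
  # Print the file name so the caller knows where the list was written.
  echo "$out"
}
```

fsck also has a -move option, which moves the affected files to /lost+found instead of deleting them; whether that is preferable depends on your situation.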

Comments

I ran hdfs fsck with -delete, and only afterwards found that the DataNode configuration was wrong: two directories were missing from the DataNode storage directory setting.

Is there any way to rebuild the lost corrupt blocks? Many thanks.