Created on 10-27-2013 12:09 AM - edited 09-16-2022 01:49 AM
Cloudera Manager is showing a Bad health status for HDFS. To rid Hadoop of corrupt files, I ran the fsck command.
The command was:
$ hadoop fsck / -delete
I issued this command as the user 'hdfs'.
However, the command ends with an internal error message, as shown below:
-----------------------------------------------------------------------------------
Status: CORRUPT
Total size: 1723577901 B
Total dirs: 76
Total files: 123 (Files currently being written: 1)
Total blocks (validated): 126 (avg. block size 13679189 B)
********************************
CORRUPT FILES: 2
CORRUPT BLOCKS: 2
********************************
Minimally replicated blocks: 126 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 126 (100.0 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 3
Average block replication: 1.0
Corrupt blocks: 2
Missing replicas: 252 (66.666664 %)
Number of data-nodes: 1
Number of racks: 1
FSCK ended at Sun Oct 27 12:21:49 IST 2013 in 35 milliseconds
FSCK ended at Sun Oct 27 12:21:49 IST 2013 in 36 milliseconds
fsck encountered internal errors!
Fsck on path '/' FAILED
------------------------------------------------------------------------------------------------------
What should I do? I am using Cloudera Standard 4.6.3 (#192 built by jenkins on 20130812-1221 git: fa61cf8559fbefeb5af7f223fd02164d1a0adfdb) on a single CentOS machine with 16 GB of RAM.
I would be grateful for any help.
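As a side note, running fsck with -delete removes corrupt files immediately. A more cautious sequence (a sketch using standard HDFS fsck options, run as the 'hdfs' user against a live cluster) would list the damage first:

```shell
# List corrupt files and their blocks without deleting anything (read-only).
hdfs fsck / -list-corruptfileblocks

# Inspect a suspect path in detail: per-file block list and replica locations.
hdfs fsck / -files -blocks -locations

# Only after confirming the files are expendable, remove them.
hdfs fsck / -delete
```

This way you know exactly which files will be lost before anything is deleted.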
Created 10-28-2013 08:54 AM
When you are running on a single machine, you must set the replication factor (dfs.replication) to 1. Since the default is 3 and there are not 3 DataNodes in your cluster, HDFS will just sit there trying to replicate blocks that it cannot. You can see this in your fsck output:
Default replication factor: 3
Under-replicated blocks: 126 (100.0 %)
If you restart the cluster with replication set to 1, the cluster should report healthy again.
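A sketch of the change as a config fragment (the property lives in hdfs-site.xml; in Cloudera Manager the same value is set under the HDFS service configuration):

```xml
<!-- hdfs-site.xml: default replication target for newly written files -->
<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>
```

Note that dfs.replication only applies to files written after the change; files that already exist keep their old target of 3, so you may also need to re-target them with `hadoop fs -setrep -w 1 /` to clear the under-replicated warnings.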
Created 11-03-2013 02:20 AM
Very sorry for not responding earlier. The solution you gave sorted out my problem: fsck ran without any errors and reported a healthy cluster.
My only grievance is that Cloudera Manager shows a RED icon and warns us about under-replicated blocks even though the cluster is a single machine.
Thanks for the help.
(Another reason for not responding earlier is that the Cloudera forum requires a password of at least eight characters combining letters and digits. That makes the password harder to remember than on other forums, so I had to create a new one each time I logged in. This time I have stored it on my machine.)