Support Questions

How to properly format HDFS in an already installed cluster?

Guru

Hi,

during the installation of a cluster I ran into some hardware issues, so that I now have an (almost) running cluster, but with corrupt file blocks.

The HDFS service is up and running in HA mode, but it is complaining about corrupt blocks:

FSCK started by hdfs (auth:SIMPLE) from /10.41.27.10 for path / at Thu Jan 21 13:22:00 CET 2016
..............
/hdp/apps/2.2.4.2-2/hive/hive.tar.gz: CORRUPT blockpool BP-1565025838-10.41.27.10-1452263064113 block blk_1073741862
/hdp/apps/2.2.4.2-2/hive/hive.tar.gz: MISSING 1 blocks of total size 83000677 B..
/hdp/apps/2.2.4.2-2/mapreduce/hadoop-streaming.jar: CORRUPT blockpool BP-1565025838-10.41.27.10-1452263064113 block blk_1073741863
/hdp/apps/2.2.4.2-2/mapreduce/hadoop-streaming.jar: MISSING 1 blocks of total size 104996 B..
/hdp/apps/2.2.4.2-2/mapreduce/mapreduce.tar.gz: CORRUPT blockpool BP-1565025838-10.41.27.10-1452263064113 block blk_1073741827
/hdp/apps/2.2.4.2-2/mapreduce/mapreduce.tar.gz: CORRUPT blockpool BP-1565025838-10.41.27.10-1452263064113 block blk_1073741829
/hdp/apps/2.2.4.2-2/mapreduce/mapreduce.tar.gz: MISSING 2 blocks of total size 192697367 B..
/hdp/apps/2.2.4.2-2/pig/pig.tar.gz: CORRUPT blockpool BP-1565025838-10.41.27.10-1452263064113 block blk_1073741861
/hdp/apps/2.2.4.2-2/pig/pig.tar.gz: MISSING 1 blocks of total size 97548644 B..
/hdp/apps/2.2.4.2-2/tez/tez.tar.gz: CORRUPT blockpool BP-1565025838-10.41.27.10-1452263064113 block blk_1073741826
/hdp/apps/2.2.4.2-2/tez/tez.tar.gz: MISSING 1 blocks of total size 40658186 B..
/mr-history/done/2016/01/08/000000/job_1452263100546_0003-1452263260432-ambari%2Dqa-PigLatin%3ApigSmoke.sh-1452263277399-1-0-SUCCEEDED-default-1452263269870.jhist: CORRUPT blockpool BP-1565025838-10.41.27.10-1452263064113 block blk_1073742129
...
/user/ambari-qa/passwd: MISSING 1 blocks of total size 2637 B...
/user/ambari-qa/pigsmoke.out/part-v000-o000-r-00000: CORRUPT blockpool BP-1565025838-10.41.27.10-1452263064113 block blk_1073742141
/user/ambari-qa/pigsmoke.out/part-v000-o000-r-00000: MISSING 1 blocks of total size 358 B.
Status: CORRUPT
 Total size:    414892275 B
 Total dirs:    7291
 Total files:   38
 Total symlinks:                0
 Total blocks (validated):      35 (avg. block size 11854065 B)
  ********************************
  CORRUPT FILES:        23
  MISSING BLOCKS:       24
  MISSING SIZE:         414887859 B
  CORRUPT BLOCKS:       24
  ********************************
 Minimally replicated blocks:   11 (31.428572 %)
 Over-replicated blocks:        0 (0.0 %)
 Under-replicated blocks:       0 (0.0 %)
 Mis-replicated blocks:         0 (0.0 %)
 Default replication factor:    2
 Average block replication:     0.62857145
 Corrupt blocks:                24
 Missing replicas:              0 (0.0 %)
 Number of data-nodes:          4
 Number of racks:               1
FSCK ended at Thu Jan 21 13:22:00 CET 2016 in 157 milliseconds
The filesystem under path '/' is CORRUPT

What I want to do now is re-format HDFS to start with a blank filesystem, since this is a new installation and no data has been uploaded to HDFS yet.

How can I properly re-format HDFS to get rid of the corrupt blocks?

I am hesitant to delete just the files it is complaining about, because if I delete e.g. /hdp/apps/2.2.4.2-2/hive/hive.tar.gz, will it be re-deployed when the services restart? How will those .gz and .jar files be provided afterwards?

1 ACCEPTED SOLUTION

Master Mentor

@Gerd Koenig For a blank copy:

hadoop namenode -format (don't use this in production or in any environment that is in use)

Now, re: the corrupt blocks, see http://stackoverflow.com/questions/19205057/how-to-fix-corrupt-hadoop-hdfs

Official doc

Now, the challenge is HA; I suggest opening a support case if you have access to support.
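
For reference, a minimal sketch of the full re-format sequence on a NameNode-HA pair (assuming an HDP 2.x cluster; the ZKFC and standby steps are standard HA practice, but verify them against the docs for your version):

# Stop all HDFS services first (e.g. via Ambari). Then, on one NameNode host:
hdfs namenode -format            # destroys all HDFS metadata -- the whole filesystem is gone
hdfs zkfc -formatZK              # re-initialize the HA failover state in ZooKeeper
# Start the freshly formatted NameNode, then on the second NameNode host:
hdfs namenode -bootstrapStandby  # copies the new namespace over from the running NameNode
# Also clear the old block data from every DataNode's dfs.datanode.data.dir before
# restarting the DataNodes, or they will fail to register (cluster-ID mismatch).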


6 REPLIES

Master Mentor

@Gerd Koenig Open a support ticket to handle this... I would do the same if I were in your shoes.

Guru

Thanks @Neeraj.

Just to give you some feedback on another 'solution': in the meantime I got two more datanodes back (they had been failing at installation time). After adding those hosts and restarting HDFS, the corrupt-block errors disappeared, without any further file deletion or HDFS re-formatting.
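
A quick sanity check that everything is healthy again (a minimal sketch; it assumes the hdfs client is available on a cluster node):

hdfs fsck /            # should now end with: The filesystem under path '/' is HEALTHY
hdfs dfsadmin -report  # lists live DataNodes and any remaining missing/corrupt blocks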

Regards, Gerd

Master Mentor

@Gerd Koenig

Additionally, the files you're concerned about are shipped with our distribution; you can find them in the /usr/hdp directory on your local filesystem.

Master Guru

@Gerd Koenig If you reformat HDFS you will be left without the whole /hdp folder and will have to recreate it. If you are sure everything else is all right now, you had better remove the corrupted files and recreate them; they are all available under /usr/hdp/<hdp-version>, and you can copy them to HDFS. Details can be found in the doc given by @Neeraj Sabharwal. For example, the hive and pig files are given here, the tez files here, and so on. You can just delete the files under /user/ambari-qa; they are the result of some service checks, and there is no need to recreate them.
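
As a rough sketch of that recreate step (shown for hive.tar.gz; the exact source path under /usr/hdp/<hdp-version> is an assumption, so verify it against the doc before copying):

# Run as the hdfs user. Remove the corrupt copy, then re-upload it from the local install:
hdfs dfs -rm /hdp/apps/2.2.4.2-2/hive/hive.tar.gz
hdfs dfs -put /usr/hdp/2.2.4.2-2/hive/hive.tar.gz /hdp/apps/2.2.4.2-2/hive/
hdfs dfs -chmod 444 /hdp/apps/2.2.4.2-2/hive/hive.tar.gz  # the docs keep these read-only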

Guru

DO NOT REFORMAT for missing blocks. If it's not a test cluster, you need to identify how you ended up with missing blocks; one possible reason is that you changed the data directories and removed some. Once you have identified the root cause and are fine with it, just get the missing files from the local filesystem and upload them into HDFS again. And you can simply delete the files in /user/ambari-qa that you listed.
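
A minimal sketch of that cleanup (note that fsck's -delete flag removes the affected files from the namespace entirely, not just the bad replicas, so use it only on files you can recreate):

# See exactly which files have corrupt or missing blocks:
hdfs fsck / -list-corruptfileblocks
# Drop the throw-away service-check output:
hdfs dfs -rm -r -skipTrash /user/ambari-qa/pigsmoke.out /user/ambari-qa/passwd
# Or, more drastically, purge every corrupt file from the namespace in one go:
hdfs fsck / -delete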