Created on 09-22-2014 04:54 AM - edited 09-16-2022 02:08 AM
Hi All,
I have installed CDH 5 (5.1.2) on a 2-node cluster in an AWS VPC with Ubuntu as the base OS. After finishing my work I shut the servers down.
When I started them again, Cloudera Manager showed errors while starting HDFS and HBase. The error shown for HBase was "HDFS Under replicated blocks". After some Googling I found that the issue is with the blocks themselves; "Missing Blocks / Corrupted Files" was the error shown there.
Summary of hadoop fsck /:
Total size: 311766450 B
Total dirs: 656
Total files: 215
Total symlinks: 0
Total blocks (validated): 213 (avg. block size 1463692 B)
********************************
CORRUPT FILES: 105
MISSING BLOCKS: 105
MISSING SIZE: 118118945 B
CORRUPT BLOCKS: 105
********************************
Minimally replicated blocks: 108 (50.704224 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 43 (20.187794 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 2
Average block replication: 1.0140845
Corrupt blocks: 105
Missing replicas: 43 (8.431373 %)
Number of data-nodes: 2
Number of racks: 1
FSCK ended at Mon Sep 22 08:06:08 UTC 2014 in 155 milliseconds
----------------------------------------------------------------------------------
The filesystem under path '/' is CORRUPT
I have followed the instructions in
http://stackoverflow.com/questions/19205057/how-to-fix-corrupt-hadoop-hdfs
https://www.packtpub.com/books/content/managing-hadoop-cluster (hadoop fsck -delete)
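For reference, the commands involved look roughly like this (note that the -delete step permanently removes the files whose blocks are missing, so it should only be run once those files are confirmed to be expendable):
hadoop fsck /
hdfs fsck / -list-corruptfileblocks
hadoop fsck / -delete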
After executing the command (hadoop fsck -delete), HBase started. But while trying to start HDFS it shows the error "HDFS error: could only be replicated to 0 nodes, instead of 1".
Please help me fix this.
My concerns: Is it possible to shut down the cluster after usage?
If it is possible, which configurations do we need to take care of during the installation?
Created 09-24-2014 06:59 AM
Hey Gautam,
Thanks for the quick response. Please find my responses below.
"Missing Blocks" implies the datanodes which had block before shutdown now don't have it when they booted up. This could happen with the Instance Store. What kind of storage did you use on the nodes? This is explained here:
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/Storage.html
I have configured the cluster using EBS rather than the instance store. We need to shut down the instances after use (since the application is not yet exposed live). My working scenario is that once I start the cluster (instances) again after it has been shut down for 2-3 days, the missing-block errors appear. Does shutting down and starting the servers as needed cause any issue in CDH 5.x.x?
When you run "hadoop fsck -delete" you are telling the namenode to delete files whose blocks cannot be located. This is fine for temporary files. Before running it, however, you should run "hdfs fsck -list-corruptfileblocks" and identify the reason why the blocks are missing. If the blocks are recoverable, you won't have to delete the files themselves.
OK, but HBase won't come up without resolving this missing-block issue. Is there any other method to fix the missing blocks?
"could only be replicated to 0 nodes, instead of 1" could mean the datanodes are not healthy. Check the datanode logs under /var/log/hadoop-hdfs on both nodes to see what the problem might be. If it's not clear, paste the relevant parts to pastebin and give us the URL
This happens after running the "hadoop fsck -delete" command.
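For reference, this is roughly how I am pulling the relevant lines out of the datanode logs on each node (the log file name below is just a placeholder, since it varies between setups):
ls -lrt /var/log/hadoop-hdfs/
grep -iE 'error|exception|fatal' /var/log/hadoop-hdfs/<datanode-log-file> | tail -n 50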
Created 09-24-2014 07:15 AM
Hi Gautam,
Thanks for your quick response. Please find my answers below.
"HDFS Under replicated blocks" implies that some blocks are not duplicated enough to satisfy the default replication factor of 3. If possible consider setting up clusters with at least 3 nodes.
As of now our requirement does not need 3 nodes.
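As a side note, our default replication factor is already 2 (as shown in the fsck summary above). If it were ever set higher than the number of nodes, existing files could be brought down to 2 replicas with something like the command below; this is only a sketch, not something we have run:
hdfs dfs -setrep -R -w 2 /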
"Missing Blocks" implies the datanodes which had block before shutdown now don't have it when they booted up. This could happen with the Instance Store. What kind of storage did you use on the nodes? This is explained here:
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/Storage.html
We have configured the entire environment on EBS volumes. Our working scenario is:
- A cluster with 2 nodes.
- After making changes we need to shut down the instances (since the application is in the development stage).
- When we need to continue development we start the cluster and perform the changes.
The missing blocks show up when we start the cluster after it has been shut down for 2-3 days.
When you run "hadoop fsck -delete" you are telling the namenode to delete files whose blocks cannot be located. This is fine for temporary files. Before running it, however, you should run "hdfs fsck -list-corruptfileblocks" and identify the reason why the blocks are missing. If the blocks are recoverable, you won't have to delete the files themselves.
HBase won't start without executing the "hadoop fsck -delete" command, and the "hdfs fsck -list-corruptfileblocks" output shows around 105 missing blocks. The paths to the missing blocks show the timestamp of the time of shutdown. Does that mean we are not allowed to shut down and start the cluster according to our requirements?
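For reference, a file from that list can be inspected like this to see which blocks are affected and where their replicas were expected to be (the path is only an illustrative example, not an actual path from our cluster):
hdfs fsck /hbase/data/default/some_table -files -blocks -locations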
"could only be replicated to 0 nodes, instead of 1" could mean the datanodes are not healthy. Check the datanode logs under /var/log/hadoop-hdfs on both nodes to see what the problem might be. If it's not clear, paste the relevant parts to pastebin and give us the URL
This error happens after running the "hadoop fsck -delete" command. After this command executes, HBase starts up but HDFS shows the error "could only be replicated to 0 nodes, instead of 1".
Our ultimate goal is to:
- Create a cluster with 2 nodes.
- Shut down the cluster after completing our tasks.
- Start the cluster whenever we need to make changes or for demo purposes.
Please let us know whether the above scenario is possible in CDH 5.x.x.
Thanks in advance
Akash.
Created 10-01-2014 05:41 AM
Any suggestions to fix this issue? 🙂
Created 10-08-2014 11:49 AM
Hi Team,
I have found a solution.
When we select an instance type with an instance store for configuring CDH, the data and log files are automatically stored on the instance store. When we stop the instance, the data/logs on the instance store are deleted, and that results in the "Missing Blocks" error.
To avoid this, we need to either remove the instance store while launching the instance or manually change the data/log locations to an EBS volume after completing the installation. I think it's better to remove the instance store while launching the instance.
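For anyone hitting the same issue, a quick way to confirm where the DataNode directories actually live is to compare the configured data directory against the mounted devices; instance-store volumes show up as ephemeral devices in the instance metadata. The /dfs/dn path below is only the common Cloudera Manager default, so adjust it to your dfs.data.dir setting:
df -h /dfs/dn
lsblk
curl -s http://169.254.169.254/latest/meta-data/block-device-mapping/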
Thanks to all of you.
Cheers!!!!