Created on 03-29-2016 09:40 PM - edited 09-16-2022 03:11 AM
Hello,
I had a problem: my job failed because HBase could not find an existing table.
I then ran:
sudo -u hbase hbase hbck -repair
and now all my tables are gone (except one)!
I cannot see my old data in the hbase folder. Is there a way to recover all of this?
Please help!
Thank you!
Created 04-11-2016 10:44 PM
This is 13 days late, so I imagine you already have a solution or have moved on; I'm commenting for future searchers.
As you have learned the hard way, the only "safe" hbck option is -fixAssignments; every other option is potentially dangerous. That said, I have seen -fixAssignments multiply-assign regions when there are regions in transition. That can be fixed by running -fixAssignments again, or by failing over the HMaster and letting the assignment manager sort it out.
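A minimal sketch of the safe workflow described above. Because these commands modify a live cluster, the sketch uses a dry-run wrapper that only prints each command; remove the wrapper to run them for real (the hbck options are standard, the `sudo -u hbase` invocation mirrors the original post):

```shell
# Dry-run helper: print each command instead of executing it,
# so the sequence can be reviewed before touching a live cluster.
run() { echo "WOULD RUN: $*"; }

# Always start with a read-only hbck to see WHAT is inconsistent
run sudo -u hbase hbase hbck

# Only -fixAssignments is low-risk; it reassigns regions without
# touching data or rewriting meta
run sudo -u hbase hbase hbck -fixAssignments

# If regions in transition ended up multiply assigned, re-run the fix
# (or fail over the HMaster and let the assignment manager settle it)
run sudo -u hbase hbase hbck -fixAssignments
```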
Unfortunately, there isn't a single way the -repair option could have trashed things. HBase has three "brains": the data in HDFS under /hbase (each region carries a .regioninfo file), the ZooKeeper data, and the HBase meta table (hbase:meta on recent releases, .META. on older ones). Which of these has the problem determines which hbck option is the proper fix, so the first step in any incident is to figure out which of the three is incorrect. This is a deep topic, so I've added some helpful links for further reading [1][2].
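A quick way to look at each of the three "brains" before deciding on a fix. Again a dry-run sketch; the paths assume the default /hbase root directory and a recent layout, so adjust for your install:

```shell
# Print-only wrapper so the inspection plan can be reviewed first
run() { echo "WOULD RUN: $*"; }

# 1. HDFS: table directories and per-region .regioninfo files
run hdfs dfs -ls -R /hbase

# 2. ZooKeeper: transient cluster state (assignments, regions in transition)
run hbase zkcli ls /hbase

# 3. The catalog table mapping regions to servers
#    ('hbase:meta' on HBase 0.96+; '.META.' on older releases)
run echo "scan 'hbase:meta'" '|' hbase shell
```

If HDFS shows the region directories intact but meta or ZooKeeper disagrees, the filesystem is the copy to trust.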
Given that all of your tables disappeared but were presumably readable before, my assumption is that hbck detected holes in essentially every table — judged against a previous region state with different split points — and "fixed" them by filling those holes with new empty regions, deleting the existing regions in the process.
To recover from that, you would shut down HBase, move the table files from .Trash in HDFS back to their original location, and run an offline meta repair: hbase org.apache.hadoop.hbase.util.hbck.OfflineMetaRepair. From there you bring HBase back up and troubleshoot any remaining issues. Again, this is only the proper action if the assumption above is correct; it would absolutely be the wrong move if HDFS were corrupt and meta were the accurate copy.
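The recovery steps above, as a dry-run sketch. The trash path and the table name `mytable` are hypothetical placeholders — your .Trash location depends on which user ran the deletion and on your trash configuration, and how you stop HBase depends on your cluster manager:

```shell
# Print-only wrapper: review the restore plan before executing anything
run() { echo "WOULD RUN: $*"; }

# 1. Stop HBase entirely so nothing holds or rewrites the region files
#    (use your cluster manager's stop action in practice)
run sudo -u hbase hbase-daemon.sh stop master

# 2. Restore the table directory from the HDFS trash
#    (hypothetical paths; locate the real ones with 'hdfs dfs -ls' first)
run sudo -u hbase hdfs dfs -mv /user/hbase/.Trash/Current/hbase/data/default/mytable /hbase/data/default/mytable

# 3. Rebuild the meta table offline from the .regioninfo files on disk
run sudo -u hbase hbase org.apache.hadoop.hbase.util.hbck.OfflineMetaRepair

# 4. Start HBase again and re-check with a read-only hbck
run sudo -u hbase hbase hbck
```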
[1] http://hbase.apache.org/book.html#_region_overlap_repairs
[2] http://www.cloudera.com/documentation/enterprise/5-4-x/topics/admin_hbck_poller.html
Created 04-12-2016 08:17 AM