HBase Region in FAILED_OPEN state due to FileNotFoundException

Expert Contributor

I have an 8-node cluster running HDP 2.4. Four regions of a large table are currently stuck in the FAILED_OPEN state. When I check the region server logs I see a FileNotFoundException, indicating (I believe) that the HFile does not exist. I tried OfflineMetaRepair to remove the entries, but this did not help. The directories for these regions exist, but they do not contain any data. Can anybody suggest a way to repair this? If I need to perform manual surgery on the META table, can someone guide me through doing this correctly?

5 REPLIES


Hi @Mark Heydenrych

This can happen if a region server (RS) went down during a region split (this was fixed in later versions). You need to sideline the reference files of the region that is in FAILED_OPEN and restart the RS. If you share the logs, we can suggest which files to sideline.

Thanks,

Rajeshbabu.
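As a rough illustration of the sidelining step, the sketch below picks reference files out of a list of HDFS paths and prints the move commands without running them. It assumes that a split reference file is named `<hfile>.<parent-region-encoded-name>` (hex characters on both sides of a single dot) while a plain HFile name has no dot. The function names and the sideline directory are hypothetical, and the output should be checked against the paths named in the region server logs before anything is moved.

```shell
# Sketch only: decides by file name, never touches HDFS itself.
# Assumption: reference files look like "<hex-hfile>.<hex-parent-region>".
is_reference_file() {
  # Strip the directory part, then test the bare file name.
  printf '%s\n' "${1##*/}" | grep -Eq '^[0-9a-f]+\.[0-9a-f]+$'
}

# Read HDFS paths on stdin and print (not run) one "hdfs dfs -mv" per
# reference file. $1 is a hypothetical sideline directory outside the
# HBase root directory.
plan_sideline() {
  sideline_dir="$1"
  while IFS= read -r path; do
    if is_reference_file "$path"; then
      echo "hdfs dfs -mv $path $sideline_dir/"
    fi
  done
}
```

One possible use is to feed it a recursive listing of the affected region directory, for example `hdfs dfs -ls -R <region-dir> | awk '{print $NF}' | plan_sideline /tmp/hbase-sideline`, and to review the printed commands before running any of them.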

Expert Contributor
@Rajeshbabu Chintaguntla

Thank you so much for your help. The FileNotFound was referencing a different region to the one it was loading, and the issue was due to reference files. I moved each of these directories out of /apps/hbase (there were only four, so it was easy). After that I ran OfflineMetaRepair. Once I started HBase it loaded every region as it should. As a precaution I ran hbase hbck -repair and hbase hbck -repairHoles after this, and everything is fine now. Data is available for both reading and writing, and there are no regions in transition. Once again, thank you for your help.
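The sequence described above could be sketched roughly as follows. This is a dry-run outline, not the poster's actual commands: the `run`, `sideline_and_rebuild_meta` and `post_start_checks` helpers, the sideline directory, and the region directory arguments are all hypothetical. It assumes HBase is stopped while the directories are moved and OfflineMetaRepair runs, and that HBase is started by hand (for example through Ambari) before the hbck checks.

```shell
# Sketch only. With DRY_RUN=1 (the default) commands are printed, not run.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Step 1 (assumes HBase is stopped): move the problem directories out of
# the HBase root, then rebuild hbase:meta offline.
# $1 = hypothetical sideline dir; remaining args = directories to move.
sideline_and_rebuild_meta() {
  sideline_dir="$1"
  shift
  run hdfs dfs -mkdir -p "$sideline_dir"
  for region_dir in "$@"; do
    run hdfs dfs -mv "$region_dir" "$sideline_dir/"
  done
  run hbase org.apache.hadoop.hbase.util.hbck.OfflineMetaRepair
}

# Step 2 (after HBase has been started again): the precautionary checks.
post_start_checks() {
  run hbase hbck -repair
  run hbase hbck -repairHoles
}
```

Keeping the two steps in separate functions leaves room for the manual HBase start in between. Setting `DRY_RUN=0` would execute the commands for real, so the printed plan is worth reviewing first.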

Guru

What Rajesh said above is right. You can use

hbase hbck -repair

to automatically fix the issue. In recent versions of HDP 2.4 you should not have hit the bug that can cause this, so there may be something else wrong. Did you check whether HDFS is healthy?
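One way to check HDFS health is `hdfs fsck`, whose report ends with a line saying whether the filesystem under the given path is HEALTHY or CORRUPT. The helper below is a small sketch that reduces such a report to a single word. The function name `fsck_status` is hypothetical, and the `/apps/hbase` path in the usage comment is taken from this thread.

```shell
# Sketch only: reads an "hdfs fsck" report on stdin and summarises it.
# Prints HEALTHY if the report says so, NOT_HEALTHY otherwise.
fsck_status() {
  if grep -q "is HEALTHY"; then
    echo HEALTHY
  else
    echo NOT_HEALTHY
  fi
}

# Possible usage (needs a working HDFS client):
#   hdfs fsck /apps/hbase | fsck_status
```

A HEALTHY result only describes the current state of the blocks; it does not show whether files were lost during an earlier corruption.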

Expert Contributor

I had already tried hbase hbck -repair as well as -repairHoles prior to posting the question, with no success. We had some problems with HDFS preceding this issue. HDFS showed itself as healthy, but it had previously been corrupt. I believe this was the underlying cause of the issue. We now have HBase stable again. I added a comment to the accepted answer explaining how I solved the issue on my side. Thanks for the help.