Support Questions

zack_riesland · ‎08-29-2016

Our cluster recently had some issues related to network outages.

When all the dust settled, HBase eventually "healed" itself, and almost everything is back to working well, with a couple of exceptions.

In particular, we have one table where almost every query times out - which was never the case before. It's very small compared to most of our other tables at around 400 million rows.

(Clarification: we query through Phoenix over JDBC)

When I look at the GUI tools (like http://<my server>:16010/master-status#storeStats) it shows '1' under "offline regions" for that table (it has 33 total regions). Almost all the other tables show '0'.

Can anyone help me troubleshoot this?

I know there is a CLI tool for fixing HBase issues. I'm wondering whether that "offline region" is the cause of these timeouts.

If not, how can I figure it out?

Thanks!

sjiang · ‎08-29-2016

You can use the Master UI to find which region is offline.

To troubleshoot the root cause, please share the master log and the name of the region that is offline.

tyu · ‎08-29-2016

Zack:

You can use the hfile tool to inspect:

MY_BROKEN_TABLE/8a444fa1979524e97eb002ce8aa2d7aa/0/4f9a5c26ddb0413aa4eb64a869ab4a2c

http://hbase.apache.org/book.html#hfile_tool
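As a rough illustration of the suggestion above, here is a minimal Python sketch that builds (and optionally runs) an hfile tool command line. It assumes the `hbase` launcher script is on the PATH and uses the tool's `-v`, `-m` and `-f` options; the function names are hypothetical and only for illustration. The path quoted above is relative, so it would need to be prefixed with your cluster's HBase root directory in HDFS, which this thread does not state.

```python
import shutil
import subprocess


def build_hfile_command(hfile_path, print_meta=True):
    """Return the argv list for inspecting one HFile with the hfile tool."""
    cmd = ["hbase", "hfile", "-v"]  # -v: verbose output
    if print_meta:
        cmd.append("-m")  # -m: also print the file's meta data
    cmd += ["-f", hfile_path]  # -f: full path of the HFile to inspect
    return cmd


def inspect_hfile(hfile_path):
    """Run the tool; return its exit code, or None if `hbase` is not on PATH."""
    if shutil.which("hbase") is None:
        return None
    return subprocess.run(build_hfile_command(hfile_path)).returncode
```

Calling `inspect_hfile(...)` with the full HDFS path of the file should print whatever the tool reports; if it cannot read the file, that is a hint that the store file itself may be damaged.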

tyu · ‎08-29-2016

Zack:

Can you check other regions which failed to open (such as a97029c18889b3b3168d11f910ef04ae)?

elkan1788 · ‎06-26-2017

I think you need to check the folder access permissions. There are two places to check: `/var/log/hbase` and `/hadoop/hbase/local/jars/tmp/`. After I chowned those folders to the hbase user, the region started successfully. Try it, and good luck.
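For anyone who wants to check this before changing anything, here is a minimal Python sketch that reports who owns each of those two folders. It assumes a Unix host and that the HBase service runs as a user named `hbase`; that user name and the helper names are assumptions, so adjust them for your install.

```python
import os
import pwd


def owner_name(path):
    """Return the user name owning `path` (or the numeric uid as a string)."""
    uid = os.stat(path).st_uid
    try:
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        # uid has no passwd entry on this host
        return str(uid)


def check_hbase_dirs(paths, expected_user="hbase"):
    """Map each path to 'ok', 'missing', or 'owned by <user>'."""
    report = {}
    for path in paths:
        if not os.path.isdir(path):
            report[path] = "missing"
            continue
        owner = owner_name(path)
        report[path] = "ok" if owner == expected_user else "owned by " + owner
    return report


# The two folders mentioned in this reply.
HBASE_DIRS = ["/var/log/hbase", "/hadoop/hbase/local/jars/tmp/"]
```

Running `check_hbase_dirs(HBASE_DIRS)` on a region server shows which folders, if any, are not owned by the expected user. Those are the ones to `chown`.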

Cloudera Community

How to fix "offline regions" in HBase