HBase - Region in Transition

Explorer

Hi Folks,

We are running into an issue where one of our regions has been stuck in the transition state for a few weeks.

 

8befbfc89993b7e57e7e21cf17113dfc state=OFFLINE, ts=1464546604126, server=null}

 

From the HDFS logs it looks like one of the 3 files in the region directory is missing blocks:

/var/shn/data/hbase/data/default/user_counters/8befbfc89993b7e57e7e21cf17113dfc/stats-monthly/c4b15e7f297c498d8d935378804af732: CORRUPT blockpool BP-1312200060-10.0.4.237-1411698869524 block blk_1287541730

 MISSING 1 blocks of total size 663467 B

0. BP-1312200060-10.0.4.237-1411698869524:blk_1287541730_213952706 len=663467 MISSING!

 

If I remove just this one corrupted file and run assign 'region id', can I recover all my remaining files? Please advise.

 

Thank You,

Mastan

Mentor
Yes, you should be able to reassign the region once the bad file is moved out
of its /hbase location (to somewhere like /tmp).

It may also be worth investigating how you ended up losing that block
altogether, especially if the file's timestamp is newer (your NN logs would
help; search for the block ID in them).
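
For anyone following along, here is a minimal sketch of the two steps above, assuming the file path and encoded region name from the original post. It only builds the command strings and does not run anything; the helper name quarantine_and_reassign and the /tmp target are illustrative, not part of HBase or HDFS.

```python
import posixpath
import shlex


def quarantine_and_reassign(bad_file, encoded_region, quarantine_dir="/tmp"):
    """Build (but do not run) the commands for the recovery described above.

    Hypothetical helper: returns the HDFS move command for the bad store
    file and the HBase shell statement that reassigns the region.
    """
    # Keep the original file name so the file can be inspected or restored later.
    target = posixpath.join(quarantine_dir, posixpath.basename(bad_file))
    move_cmd = "hdfs dfs -mv {} {}".format(shlex.quote(bad_file), shlex.quote(target))
    # HBase shell statement; run it inside `hbase shell` after the move succeeds.
    assign_stmt = "assign '{}'".format(encoded_region)
    return move_cmd, assign_stmt


# Values taken from the original post.
bad_file = (
    "/var/shn/data/hbase/data/default/user_counters/"
    "8befbfc89993b7e57e7e21cf17113dfc/stats-monthly/"
    "c4b15e7f297c498d8d935378804af732"
)
move_cmd, assign_stmt = quarantine_and_reassign(
    bad_file, "8befbfc89993b7e57e7e21cf17113dfc"
)
print(move_cmd)
print(assign_stmt)
```

Moving the file rather than deleting it keeps the option of inspecting it later. Any data that lived only in that store file will not be readable after the region comes back online.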

Explorer

Thank you Harsh, that worked. Unfortunately the NN logs have rolled over, so no root cause could be found.

This is possibly due to the multiple node issues we had about a month ago.

On the same note, I have a quick question: if I take a node down for patching, how long does the NN wait before it starts re-replicating the under-replicated blocks?

Mentor
By default, the NameNode waits up to 10.5 minutes before declaring a non-heartbeating DataNode dead and processing its block list as under-replicated.

P.S. It's better to open a new topic per question; it helps others searching for a specific Q&A.
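
As a worked check of the 10.5-minute figure: the thread does not name the settings involved, so the property names and defaults below are assumptions taken from stock HDFS (dfs.namenode.heartbeat.recheck-interval, 300000 ms, and dfs.heartbeat.interval, 3 s). The function name is illustrative; verify the values against your own hdfs-site.xml.

```python
def datanode_dead_timeout_seconds(recheck_interval_ms=300_000, heartbeat_interval_s=3):
    """Seconds of missed heartbeats before the NameNode marks a DataNode dead.

    Assumed inputs (stock HDFS defaults, not stated in the thread):
      recheck_interval_ms  -> dfs.namenode.heartbeat.recheck-interval
      heartbeat_interval_s -> dfs.heartbeat.interval
    """
    # Two recheck intervals plus ten heartbeat intervals.
    return 2 * recheck_interval_ms / 1000 + 10 * heartbeat_interval_s


timeout_s = datanode_dead_timeout_seconds()
print(timeout_s)       # 630.0
print(timeout_s / 60)  # 10.5
```

With the defaults this gives 2 * 300 s + 10 * 3 s = 630 s, which matches the 10.5 minutes quoted above.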