Created 07-14-2016 10:41 AM
I have an HBase cluster that is having a major problem at the moment. My namespace and META tables appear to be working correctly. However, the regions for my table are not being deployed on region servers. Instead, they become stuck in the FAILED_OPEN state, often for longer than 20 minutes. Since they are classed as regions in transition, balancing fails and cannot help. I have searched the log and there doesn't seem to be anything useful. I have tried the following:
None of these has helped. I have checked that HDFS is not corrupt: hdfs fsck / says it's healthy.
When running hbase hbck -details table_name, the only inconsistencies listed are that the regions are not deployed. I saw a recommendation online and followed it, doing the following:
1. Stop HBase
2. Use a ZooKeeper CLI and run "rmr /hbase" to delete the HBase znodes
3. Run OfflineMetaRepair
4. Restart HBase. It will recreate the znodes
This still does not solve my problem. Is there anything more that anybody can suggest? I don't want to truncate the tables, since we have over 2 months' worth of data that we need to keep.
Created 07-14-2016 10:49 AM
Can you find the encoded region names for the regions stuck in the FAILED_OPEN state and pastebin the related region server log?
It would help us understand what caused the regions not to open.
BTW, which HDP version are you running?
Created 07-14-2016 11:22 AM
You inadvertently solved my problem. I had not seen that the HBase Master tells you which server it is trying to assign each region to. I pulled up the region server logs and found the following line:
org.apache.hadoop.security.AccessControlException: Permission denied
We had mistakenly changed the owner of /apps/hbase to hdfs, meaning that the hbase user could not write. We did hdfs dfs -chown -R hbase /apps/hbase and this has allowed the regions to be correctly assigned. Really appreciate your help.
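For anyone hitting the same error, here is a rough sketch of how the ownership could be checked before running the chown. It assumes the HBase root directory is /apps/hbase (as in this post) and that the expected owner is the hbase user; the function name wrong_owner_paths is made up for this example. It only filters the output of hdfs dfs -ls -R, so it changes nothing by itself, and it will misreport paths that contain spaces.

```shell
#!/usr/bin/env bash
# Sketch only: wrong_owner_paths is a hypothetical helper, not an HBase or HDFS tool.

# Reads `hdfs dfs -ls -R` output on stdin and prints every path whose owner
# (3rd column) differs from the expected owner passed as $1 (default: hbase).
wrong_owner_paths() {
  local expected="${1:-hbase}"
  # Entry lines have 8 fields: perms, replication, owner, group, size, date, time, path.
  # Shorter lines such as "Found N items" are skipped by the NF check.
  awk -v owner="$expected" 'NF >= 8 && $3 != owner { print $8 }'
}

# Usage on a live cluster (not executed by this snippet):
#   hdfs dfs -ls -R /apps/hbase | wrong_owner_paths hbase
# If paths are listed, the fix used in this thread was:
#   hdfs dfs -chown -R hbase /apps/hbase
```

An empty result would suggest that ownership is not the cause and that the region server log needs a closer look.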
For what it's worth, we're running HDP 2.4.
Created 07-14-2016 10:56 AM
Check your master logs: search for a region that is in transition, find which region server it is trying to open on, and then check the logs of that region server.
If those logs look normal and all of your regions in transition are failing to open on one region server only, stop that region server and see whether the regions open properly elsewhere.
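The log search described above can be sketched roughly as follows. The helper name region_log_context is made up for this example, and the log path in the usage comment is only an assumption about a typical install, so adjust it to wherever your region server logs actually live.

```shell
#!/usr/bin/env bash
# Sketch only: region_log_context is a hypothetical helper built on plain grep.

# Prints every line of a log file that mentions the given encoded region name,
# plus a few lines of trailing context so that a following exception is visible.
#   $1 = encoded region name (as shown in the master UI or master log)
#   $2 = path to the region server log file
#   $3 = number of context lines after each match (default: 20)
region_log_context() {
  local region="$1" logfile="$2" context="${3:-20}"
  # -F treats the region name as a fixed string, -n adds line numbers.
  grep -n -F -A "$context" -- "$region" "$logfile"
}

# Usage (the log path is an assumption, not taken from this thread):
#   region_log_context <encoded-region-name> /var/log/hbase/<regionserver-log-file>
```

In the case from this thread, the context lines after the failed open would have shown the AccessControlException.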