Created 07-21-2022 05:14 AM
Hello everybody, there was an electrical problem and the cluster was suddenly shut down.
After restarting everything, HBase shows all the Region Servers online (but with 0 regions each), and Region Servers with the same names are also listed under Dead Region Servers.
Every time I restart HBase, new rows are added to the Dead Region Servers list.
This already happened to me a long time ago and the problem was related to ZooKeeper, but I can't find the old post.
Do you know what I can do? Thanks
P.S. My cluster is kerberized, HBase version 2.0
Created 07-21-2022 07:24 AM
Hello @loridigia ,
It seems that, because of the outage, multiple ServerCrashProcedures were created for the Region Servers. The dead Region Servers with the same names are different instances of the Region Servers, each with a different epoch timestamp. Since the HBase Master was also down, it may not have been able to process the expiration of the Region Servers. You might see some crash procedures waiting to finish under the "Procedures & Locks" section of the active HBase Master Web UI.
As you have already solved this issue in the past through ZooKeeper, I guess you can try this:
1. Stop HBase
2. Log in to ZooKeeper using # hbase zkcli (with a valid hbase ticket)
3. Delete the /hbase-secure znode: rmr /hbase-secure
4. Sideline the entries under the HDFS dir: hdfs dfs -mv /hbase/MasterProcWALs/* /tmp (not sure if this was done earlier)
5. Start HBase
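Steps 2 to 4 above can be sketched as a small shell helper. This is a minimal sketch, not a vetted procedure: the names `run`, `reset_master_state` and the `DRY_RUN` switch are made up for illustration, the znode and HDFS paths are the ones quoted in the steps, and it assumes HBase is already stopped and `kinit` has been run with an hbase principal. By default it only prints the commands.

```shell
# Print a command instead of running it unless DRY_RUN=0 (hypothetical switch).
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# Steps 3 and 4: remove the HBase parent znode, then sideline the Master
# procedure WALs. Run only while HBase is stopped.
#   $1 = parent znode (the thread uses /hbase-secure)
reset_master_state() {
  local znode
  znode="${1:-/hbase-secure}"
  # rmr is the recursive delete in the ZooKeeper 3.4 CLI; ZooKeeper 3.5+
  # deprecates it in favour of deleteall. The same line can also be typed
  # at the interactive "hbase zkcli" prompt.
  run hbase zkcli rmr "$znode"
  # The glob is quoted so that hdfs, not the local shell, expands it.
  # /hbase is assumed to be the HBase root dir on HDFS, as in the steps.
  run hdfs dfs -mv '/hbase/MasterProcWALs/*' /tmp
}
```

Calling `reset_master_state /hbase-secure` prints the two commands for review; `DRY_RUN=0 reset_master_state /hbase-secure` actually runs them.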
Created 07-22-2022 01:03 AM
Hi @rki_and thanks for your answer, it was exactly what was needed.
But, if I may ask: after that I see all Region Servers online, 0 offline, and all regions on one Region Server except for meta, which is on another one (I have 3 in total).
The problem is that I get this error in the master:
org.apache.hadoop.hbase.NotServingRegionException: hbase:quota,,1620896369946.28dd7c81713c9347e8dfe4e6993b1ec7. is not online on my-server3.domain.com,16020,1658432084980
Do you have any idea what I could do?
Thanks
Created 08-28-2024 04:07 AM
Can you please provide detailed commands to do this?
2. Login to zookeeper using #hbase zkcli ( with a valid hbase ticket )
3. Delete the /hbase-secure znode. rmr /hbase-secure
Created 08-28-2024 04:30 AM
I am unable to locate the /hbase-secure znode. Which one should I delete? I have the same issue, but I only have the /hbase znode.
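On the question of which znode applies: the parent znode is whatever `zookeeper.znode.parent` is set to in hbase-site.xml. HBase's default is /hbase, and some distributions set /hbase-secure on kerberized clusters. A minimal sketch for reading it from the config file follows; the function name `znode_parent` and the default file location are assumptions, not something given in this thread.

```shell
# Print the parent znode configured in an hbase-site.xml file.
# Assumption: <value> directly follows <name> inside the property element.
# Falls back to HBase's default, /hbase, when the property is not set.
znode_parent() {
  local site
  site="${1:-/etc/hbase/conf/hbase-site.xml}"   # assumed default location
  if [ ! -r "$site" ]; then
    echo "cannot read $site" >&2
    return 1
  fi
  local value
  # Flatten the file to one line, then capture the value that follows the name.
  value="$(tr -d '\n' < "$site" \
    | sed -n 's:.*<name>zookeeper\.znode\.parent</name>[^<]*<value>\([^<]*\)</value>.*:\1:p')"
  printf '%s\n' "${value:-/hbase}"
}
```

Running `hbase zkcli ls /` with a valid ticket also lists the top-level znodes, which shows whether /hbase or /hbase-secure is the one in use.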
Created 07-22-2022 01:44 AM
Hello @loridigia
You can try to assign the region from hbase shell.
> assign '28dd7c81713c9347e8dfe4e6993b1ec7'
If you can attach the output of the command below (with a valid ticket), we can check which regions are offline or in transition.
# hbase hbck -details
Created 07-22-2022 02:29 AM
Hi RKI, the command worked and that error is now gone... but running "hbase hbck -details" I got 560 inconsistencies, all the same:
ERROR: There is a hole in the region chain between and . You need to create a new .regioninfo and region dir in hdfs to plug the hole.
Created 07-22-2022 02:43 AM
Hi,
A hole in the region chain most probably indicates that some regions are not yet online, which creates the hole.
# cat hbck.report | grep "not deployed on any region server"
If you see regions in the output of the above command, you will need to assign them using the hbase shell.
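The grep-and-assign step can be scripted. The sketch below is one possible way, under stated assumptions: the function name `assign_commands` is made up, hbck.report is assumed to hold the saved output of `hbase hbck -details`, and each matching line is assumed to contain the full region name ending in `.<32-hex encoded name>.`, as in the NotServingRegionException quoted earlier in the thread.

```shell
# Print one hbase shell "assign" command per region that the hbck report
# lists as "not deployed on any region server".
#   $1 = file holding the saved output of "hbase hbck -details"
assign_commands() {
  grep 'not deployed on any region server' "$1" \
    | grep -oE '\.[0-9a-f]{32}\.' \
    | tr -d '.' \
    | sort -u \
    | sed "s/.*/assign '&'/"
}
```

A possible usage, after reviewing the generated list: `assign_commands hbck.report > assign.txt`, then `hbase shell -n < assign.txt` with a valid ticket.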
Created 07-22-2022 06:30 AM
You are a SAVIOUR !!
I made a script to assign all regions reported as "not deployed on any region server" and now it works fine!!
Awesome thanks a lot mate!