Created on 09-28-2020 11:43 PM - edited 09-16-2022 07:38 AM
Hello,
We are facing a problem in our production environment. The HBase shell command "remove_peer '<peer>'" is not working. The peer was previously disabled (that command took a very long time to return to the prompt).
In our other environments, removing a peer worked fine.
Any idea what the problem is and how to fix it?
Thanks for any help.
Regards
Daniel
Created 10-04-2020 11:57 PM
'abort_procedure' is deprecated and not available. It seems the HBCK2 'bypass' command now has to be used to abort a procedure.
We did not apply this because, fortunately, an upgrade to CDH 6.3.3 on one cluster and a YARN patch on the other made an HBase restart necessary. That removed the peers on both clusters.
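For reference, a minimal sketch of what the HBCK2 call might look like. It was not run in this thread. The jar path is a hypothetical placeholder, and 8290 is the procedure ID reported in the error message in this thread. The snippet only prints the command, because bypassing a procedure can leave the cluster in an inconsistent state and should be checked by an HBase administrator first.

```shell
# Hypothetical location of the HBCK2 jar; adjust to your installation
HBCK2_JAR="/path/to/hbase-hbck2.jar"

# Procedure ID taken from the error "The procedure 8290 is still running"
PROC_ID=8290

# HBCK2 is launched through "hbase hbck -j <jar>"; "bypass" takes one or more procedure IDs
BYPASS_CMD="hbase hbck -j $HBCK2_JAR bypass $PROC_ID"

# Dry run: print the command instead of executing it
echo "$BYPASS_CMD"
```

Once reviewed, the printed command can be run by a user with HBase admin rights.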
Created 09-29-2020 01:34 AM
@dallanic Is there any error message you are getting?
You might want to review the available shell commands and try again; the doc below may be useful.
https://learnhbase.net/2013/03/02/hbase-shell-commands/
Created 09-29-2020 01:43 AM
The error message is:
remove_peer '1'
ERROR: The procedure 8290 is still running
For usage try 'help "remove_peer"'
Took 610.7577 seconds
Thanks, but there is no other command for removing a peer, and this is the same command we used on 2 other clusters without any problem. Other commands, such as disabling the peer or getting peer information, are working fine.
Created 09-29-2020 04:52 AM
@dallanic It indicates that the procedure that was started to remove the peer has hung for some reason.
So there might be an issue with the hosts, disk, memory, or the table itself.
I would say you need a clean retry: try to kill the hung procedure and then run the command again.
Created 09-29-2020 05:13 AM
@GangWar I think you are right. Since I don't have sufficient rights, I will ask someone to run list_procedures to find the procedure number that may be causing the problem, and then to run abort_procedure on it.
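A minimal sketch of that lookup, assuming the "hbase" launcher is on the PATH and the account has the required rights. The ID 8290 comes from the error message above; the grep filter is only an illustration and may need adjusting to the actual list_procedures output.

```shell
# Procedure ID taken from the error "The procedure 8290 is still running"
PROC_ID=8290

# HBase shell command that lists the master procedures
CMD='list_procedures'

# Only contact the cluster when the hbase launcher is available
if command -v hbase >/dev/null 2>&1; then
  # -n runs the HBase shell in non-interactive mode
  echo "$CMD" | hbase shell -n | grep -w "$PROC_ID"
else
  echo "hbase not found; would run: echo \"$CMD\" | hbase shell -n | grep -w $PROC_ID"
fi
```

The matching row should show the state of the stuck procedure, which helps decide what to do next.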
Created 09-29-2020 06:21 AM
@dallanic Sure. Keep us posted here, and don't forget to mark the post as the solution so that it helps other members as well.