strange HBase alert after HDP upgrade 2.2.4 => 2.3.4

Guru

Hello,

After upgrading HDP from 2.2.4 to 2.3.4.7 using Ambari 2.2.1.1, the HBase service shows a strange alert.

It notifies me about dead region servers, and the details say "7 out of 5 dead region servers"... hmmmm, really?! The total of 5 is correct: that is the number of region servers, and they are all up and running (and marked green in Ambari itself).

3496-hbase-state.png

3495-hbase-alert.png

How can I get rid of this wrong alert notification? Just waiting a couple of days didn't solve the problem 😉

Thanks, Gerd

4 REPLIES

Master Guru

HBase ports changed in HDP-2.3.4, details here. Region servers have logical names that include the RS port. An example RS name: "sandbox.hortonworks.com,16020,1460965964168". Before the upgrade the RS port was 60020, and now it's 16020. So, if you have 5 machines running RS, there are 10 RS names to be taken care of across the upgrade (5 with the old port and 5 with the new one), and in your case the HBase master may think that the RS names with 60020 are still in use. Restarting HBase (or just the HBase master) is supposed to remove them and solve your issue. Before and after the restart you can check the region servers in your HBase Web UI (HBase --> Quick Links).
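To make the naming scheme concrete, here is a small sketch that splits an RS name of the form host,port,startcode into its parts. The function names rs_host and rs_port are made up for this illustration and are not part of HBase; the example name is the one quoted above.

```shell
#!/usr/bin/env bash
# Sketch only: rs_host and rs_port are hypothetical helper names.
# An RS name looks like "host,port,startcode".

# Print the host part of a region server name.
rs_host() {
  printf '%s\n' "$1" | cut -d, -f1
}

# Print the port part of a region server name.
rs_port() {
  printf '%s\n' "$1" | cut -d, -f2
}

rs_port "sandbox.hortonworks.com,16020,1460965964168"   # prints 16020
```

A name whose port field is 60020 would therefore belong to the pre-upgrade naming.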

Guru

Hi @Predrag Minovic, thanks for answering.

Your description is exactly what is happening here, but even after restarting HBase multiple times the alert doesn't disappear. The list of dead region servers still looks weird, because some of them appear multiple times and some are missing entirely =>

3498-hbase-regionserver-details.png

Next try?

Master Guru

Next try: Log in as user hdfs and run "hdfs dfs -ls /apps/hbase/data/WALs".

  • Move all directories (5 of them?) that refer to the old port 60020 to a folder under /user/hdfs, and restart the HBase master.
  • In the same listing, identify any directories that have "splitting" in their name. If there are any, check that each contains only one file, with "meta" in its name. Move all those "splitting" directories to /user/hdfs as well, and restart HBase.
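The two steps above could be sketched roughly like this. It is a dry run that only lists candidate directories. The backup folder name and the helper function names are assumptions made for this example, and the "-splitting" suffix pattern is a guess at how such directories are named, so review the output before moving anything.

```shell
#!/usr/bin/env bash
# Sketch only. WAL_ROOT is the path from the post above; BACKUP_DIR and the
# function names are hypothetical and chosen for this example.
WAL_ROOT=/apps/hbase/data/WALs
BACKUP_DIR=/user/hdfs/hbase-wals-backup

# Read `hdfs dfs -ls` output on stdin and print only the path column
# (this also skips the "Found N items" header line).
wal_paths() {
  awk '$NF ~ /^\// { print $NF }'
}

# Keep directories whose name carries the given port: host,PORT,startcode
dirs_with_port() {
  grep -E ",$1,[0-9]+(-splitting)?$"
}

# Keep directories whose name ends in "-splitting" (assumed naming).
splitting_dirs() {
  grep -E -- '-splitting$'
}

# Dry run: only list candidates, and only if the hdfs client is available.
if command -v hdfs >/dev/null 2>&1; then
  listing=$(hdfs dfs -ls "$WAL_ROOT")
  echo "Directories that still refer to port 60020:"
  printf '%s\n' "$listing" | wal_paths | dirs_with_port 60020
  echo "Directories marked as splitting:"
  printf '%s\n' "$listing" | wal_paths | splitting_dirs
  # After reviewing the lists (and checking that each splitting directory
  # holds a single file with "meta" in its name), move them by hand:
  #   hdfs dfs -mkdir -p "$BACKUP_DIR"
  #   hdfs dfs -mv <directory> "$BACKUP_DIR"/
fi
```

Moving the directories rather than deleting them keeps a copy around in case something turns out to be needed after the restart.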

Guru

Brilliant @Predrag Minovic, that solved the issue. Thanks!