
hbase region server going down


I frequently see the messages below in the region server logs, after which that region server goes down. Is there a particular reason for this?

2016-10-12 07:24:51,105 WARN [regionserver/hostname/10.107.107.152:16020] zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper, quorum=zk1:2181,zk2:2181,zk3:2181, exception=org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase-unsecure/rs/hostname,16020,1475939210143
2016-10-12 07:24:59,105 WARN [regionserver/hostname/10.107.107.152:16020] zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper, quorum=zk1:2181,zk2:2181,zk3:2181, exception=org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase-unsecure/rs/hostname,16020,1475939210143
2016-10-12 07:24:59,105 ERROR [regionserver/hostname/10.107.107.152:16020] zookeeper.RecoverableZooKeeper: ZooKeeper delete failed after 4 attempts
2016-10-12 07:24:59,105 WARN [regionserver/hostname/10.107.107.152:16020] regionserver.HRegionServer: Failed deleting my ephemeral node
org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase-unsecure/rs/hostname,16020,1475939210143
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
    at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:873)
    at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.delete(RecoverableZooKeeper.java:178)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.deleteNode(ZKUtil.java:1221)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.deleteNode(ZKUtil.java:1210)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.deleteMyEphemeralNode(HRegionServer.java:1403)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:1079)
    at java.lang.Thread.run(Thread.java:745)
2016-10-12 07:24:59,108 INFO [regionserver/hostname/10.107.107.152:16020] regionserver.HRegionServer: stopping server hostname,16020,1475939210143; zookeeper connection closed.
2016-10-12 07:24:59,108 INFO [regionserver/hostname/10.107.107.152:16020] regionserver.HRegionServer: regionserver/hostname/10.107.107.152:16020 exiting
2016-10-12 07:24:59,108 ERROR [main] regionserver.HRegionServerCommandLine: Region server exiting
java.lang.RuntimeException: HRegionServer Aborted
    at org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.start(HRegionServerCommandLine.java:68)
    at org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.run(HRegionServerCommandLine.java:87)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
    at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.main(HRegionServer.java:2651)

2 REPLIES

Re: hbase region server going down

Super Guru

Check for:

1. JVM GC pauses. A long stop-the-world garbage collection freezes the region server, so it cannot keep its ZooKeeper connection alive and will likely lose its session. Read the lines in the HBase service log just before this error for signs of a long pause.

2. Errors in the ZooKeeper log about maxClientCnxns (https://community.hortonworks.com/articles/51191/understanding-apache-zookeeper-connection-rate-lim.html)

3. Operating system swappiness. Ensure it is reduced from the default (often 30 or 60) to a value of 0. You can inspect the current value via `cat /proc/sys/vm/swappiness`.
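
If it helps, the three checks above can be roughly scripted. This is only a minimal sketch, not an official tool: the function names and the `threshold_ms` default are made up for illustration, and the log wording it searches for ("Detected pause in JVM or host machine" from HBase's JvmPauseMonitor, and "Too many connections from" in the ZooKeeper server log) is an assumption about how your versions phrase these messages, so adjust the patterns to whatever your logs actually print.

```python
import re
from pathlib import Path

# Assumed wording of the HBase JvmPauseMonitor warning; adjust for your version.
PAUSE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}).*"
    r"Detected pause in JVM or host machine.*?approximately (?P<ms>\d+)ms"
)

# Assumed wording of the ZooKeeper server warning when maxClientCnxns is hit.
TOO_MANY_CNXNS = "Too many connections from"


def find_jvm_pauses(lines, threshold_ms=10000):
    """Return (timestamp, pause_ms) for each logged pause >= threshold_ms.

    threshold_ms is a hypothetical default; pauses approaching your
    configured zookeeper.session.timeout are the ones that matter.
    """
    pauses = []
    for line in lines:
        match = PAUSE_RE.search(line)
        if match and int(match.group("ms")) >= threshold_ms:
            pauses.append((match.group("ts"), int(match.group("ms"))))
    return pauses


def count_connection_limit_warnings(lines):
    """Count ZooKeeper log lines that report a client hitting the connection limit."""
    return sum(1 for line in lines if TOO_MANY_CNXNS in line)


def read_swappiness(path="/proc/sys/vm/swappiness"):
    """Return the current vm.swappiness value as an int (Linux only)."""
    return int(Path(path).read_text().strip())
```

To use it, open the region server log and pass the file object to `find_jvm_pauses`, do the same with the ZooKeeper server log and `count_connection_limit_warnings`, and call `read_swappiness()` on each region server host. A pause reported shortly before the "Session expired" lines points at item 1.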
