We have a five-node Hadoop cluster running HDP 2.1, managed with Ambari 1.6. It has one HBase (version 0.98) master, and three of the nodes are data nodes, which host the three region servers. An HBase application runs on this cluster.
For the last couple of weeks, the region server on data node 2 has been stopping arbitrarily, roughly once a week, without any error in the logs. For the last couple of days, the region server on data node 3 has also been going down without any error logs.
The region server logs are as follows.
Log list:
-rw-r--r-- 1 hbase hadoop 191 Dec 29 18:17 hbase-hbase-regionserver-fsdata2c.corp.arc.com.out.2
-rw-r--r-- 1 hbase hadoop 814M Dec 29 18:17 gc.log-201511240826
-rw-r--r-- 1 hbase hadoop 191 Jan 4 18:27 hbase-hbase-regionserver-fsdata2c.corp.arc.com.out.1
-rw-r--r-- 1 hbase hadoop 186M Jan 4 18:27 gc.log-201512300433
[root@fsdata2c hbase]# more hbase-hbase-regionserver-fsdata2c.corp.arc.com.out.1
/usr/lib/hbase/bin/hbase-daemon.sh: line 197: 19217 Killed nice -n $HBASE_NICENESS "$HBASE_HOME"/bin/hbase --config "${HBASE_CONF_DIR}" $command "$@" start >> "$logout" 2>&1
[root@fsdata2c hbase]#
For now, we restart the region server manually, which fixes the problem only until the region server stops again.
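A rough sketch of what such a manual restart looks like from the shell, for reference. The script path is taken from the log output above; the config directory /etc/hbase/conf and running it as the hbase user are assumptions, and on an Ambari-managed cluster the restart may instead be done from the Ambari UI.

```shell
# Script path as it appears in the .out log above.
HBASE_DAEMON="/usr/lib/hbase/bin/hbase-daemon.sh"
# Assumed config directory; adjust to the actual HBASE_CONF_DIR on the node.
HBASE_CONF_DIR="/etc/hbase/conf"

# Build the command that starts the region server daemon.
RESTART_CMD="$HBASE_DAEMON --config $HBASE_CONF_DIR start regionserver"

# Print the command; on the affected data node it would be run as the
# hbase user, for example: su - hbase -c "$RESTART_CMD"
echo "$RESTART_CMD"
```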
We need a permanent solution. Can anyone please help with this issue?