Created 01-06-2016 07:37 AM
We have a five-node Hadoop cluster running HDP 2.1 with Ambari 1.6. There is one HBase (version 0.98) master. Three of the nodes are data nodes, which host three region servers. Our HBase application runs on this cluster.
For the last couple of weeks, the region server on data node 2 has been stopping arbitrarily, roughly once a week, without any error in the logs. In the last couple of days, the region server on data node 3 has also started going down without any error logs.
The region server logs are as follows.
Log list:
-rw-r--r-- 1 hbase hadoop 191 Dec 29 18:17 hbase-hbase-regionserver-fsdata2c.corp.arc.com.out.2
-rw-r--r-- 1 hbase hadoop 814M Dec 29 18:17 gc.log-201511240826
-rw-r--r-- 1 hbase hadoop 191 Jan 4 18:27 hbase-hbase-regionserver-fsdata2c.corp.arc.com.out.1
-rw-r--r-- 1 hbase hadoop 186M Jan 4 18:27 gc.log-201512300433
[root@fsdata2c hbase]# more hbase-hbase-regionserver-fsdata2c.corp.arc.com.out.1
/usr/lib/hbase/bin/hbase-daemon.sh: line 197: 19217 Killed nice -n $HBASE_NICENESS "$HBASE_HOME"/bin/hbase --config "${HBASE_CONF_DIR }" $command "$@" start >> "$logout" 2>&1
[root@fsdata2c hbase]#
For now, we restart the region server manually, which only fixes the problem until the region server stops again.
We need a permanent solution. Can anyone please help with this issue?
Created 01-06-2016 01:41 PM
@Raja Ray are all the standard requirements set, i.e. ulimit and swappiness? Also, can you check the disk health? And what OS are you running? If it is RPM-based, do you have Transparent Huge Pages turned off?
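In case it helps, here is a rough, read-only sketch for collecting the settings asked about above on each data node. The function names `active_thp_mode` and `report_node_settings` are made up for illustration. It assumes a Linux host, and that Transparent Huge Pages is exposed under one of the two sysfs paths shown (the `redhat_` one is what RHEL/CentOS 6 kernels typically use). Run it as the user that starts the region server, since ulimit values are per user.

```shell
#!/bin/sh
# Read-only report of ulimit, swappiness, and Transparent Huge Pages (THP).
# Function names are hypothetical; nothing here changes any setting.

# Extract the active value from a THP status line such as
# "always madvise [never]" (the kernel brackets the active mode).
active_thp_mode() {
    printf '%s\n' "$1" | sed -n 's/.*\[\(.*\)\].*/\1/p'
}

report_node_settings() {
    # Limits for the current user; "unknown" if the shell lacks the option.
    echo "open files (ulimit -n): $(ulimit -n 2>/dev/null || echo unknown)"
    echo "max processes (ulimit -u): $(ulimit -u 2>/dev/null || echo unknown)"

    if [ -r /proc/sys/vm/swappiness ]; then
        echo "vm.swappiness: $(cat /proc/sys/vm/swappiness)"
    fi

    # Assumption: THP status lives under one of these two paths.
    for f in /sys/kernel/mm/transparent_hugepage/enabled \
             /sys/kernel/mm/redhat_transparent_hugepage/enabled; do
        if [ -r "$f" ]; then
            echo "THP ($f): $(active_thp_mode "$(cat "$f")")"
        fi
    done
}

report_node_settings
```

Comparing the output across the three data nodes may show whether the nodes that keep stopping are configured differently from the one that stays up.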
Created 01-19-2016 12:12 PM
Excellent, can you please accept the answer to close it out, @Raja Ray?