Created 10-07-2016 11:15 AM
All service are up after a restart. But this is a recurring issue. So posting the log to check if this is bug or configuration issue.Habse master and regionserver goes down with following log.
-------------------------------
2016-10-07 03:01:02,867 INFO [imp1tvhdpmst1.corp.test.com,16000,1475152605958_splitLogManager__ChoreService_1] master.SplitLogManager$TimeoutMonitor: Chore: SplitLogManager Timeout Monitor missed its start time 2016-10-07 03:01:02,898 INFO [main-SendThread(imp1tvhdpmst2.corp.test.com:2181)] zookeeper.ClientCnxn: Client session timed out, have not heard from server in 65745ms for sessionid 0x1572d7e3442032c, closing socket connection and attempting reconnect 2016-10-07 03:01:02,898 INFO [imp1tvhdpmst1:16000.activeMasterManager-SendThread(imp1tvhdpmst4.corp.test.com:2181)] zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x3572d7e3c28020c, likely server has closed socket, closing socket connection and attempting reconnect 2016-10-07 03:01:02,902 INFO [master/imp1tvhdpmst1.corp.test.com/172.24.125.130:16000-SendThread(imp1tvhdpmst4.corp.test.com:2181)] zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x3572d7e3c28020b, likely server has closed socket, closing socket connection and attempting reconnect 2016-10-07 03:01:02,902 INFO [imp1tvhdpmst1:16000.activeMasterManager-SendThread(imp1tvhdpmst3.corp.test.com:2181)] zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x2572d7e364901d1, likely server has closed socket, closing socket connection and attempting reconnect 2016-10-07 03:01:03,096 INFO [main-SendThread(imp1tvhdpmst4.corp.test.com:2181)] client.ZooKeeperSaslClient: Client will use GSSAPI as SASL mechanism. 2016-10-07 03:01:03,096 INFO [main-SendThread(imp1tvhdpmst4.corp.test.com:2181)] zookeeper.ClientCnxn: Opening socket connection to server imp1tvhdpmst4.corp.test.com/172.24.125.133:2181. Will attempt to SASL-authenticate using Login Context section 'Client' 2016-10-07 03:01:03,096 INFO [main-SendThread(imp1tvhdpmst4.corp.test.com:2181)] zookeeper.ClientCnxn: Socket connection established to imp1tvhdpmst4.corp.test.com/172.24.125.133:2181, initiating session 2016-10-07 03:01:03,100 INFO [main-SendThread(imp1tvhdpmst4.corp.test.com:2181)] zookeeper.ClientCnxn: Unable to reconnect to ZooKeeper service, session 0x1572d7e3442032c has expired, closing socket connection 2016-10-07 03:01:03,100 FATAL [main-EventThread] master.HMaster: Master server abort: loaded coprocessors are: [org.apache.ranger.authorization.hbase.RangerAuthorizationCoprocessor] 2016-10-07 03:01:03,101 FATAL [main-EventThread] master.HMaster: master:16000-0x1572d7e3442032c, quorum=imp1tvhdpmst2.corp.test.com:2181,imp1tvhdpmst3.corp.test.com:2181,imp1tvhdpmst4.corp.test.com:2181, baseZNode=/hbase-secure master:16000-0x1572d7e3442032c received expired from ZooKeeper, aborting org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:613) at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:524) at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:534) at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:510) 2016-10-07 03:01:03,102 INFO [main-EventThread] regionserver.HRegionServer: STOPPED: master:16000-0x1572d7e3442032c, quorum=imp1tvhdpmst2.corp.test.com:2181,imp1tvhdpmst3.corp.test.com:2181,imp1tvhdpmst4.corp.test.com:2181, baseZNode=/hbase-secure master:16000-0x1572d7e3442032c received expired from ZooKeeper, aborting 2016-10-07 03:01:03,102 INFO [main-EventThread] zookeeper.ClientCnxn: EventThread shut down 2016-10-07 03:01:03,102 WARN [master/imp1tvhdpmst1.corp.test.com/172.24.125.130:16000] util.Sleeper: We slept 63817ms instead of 3000ms, this is likely due to a long garbage collecting pause and it's usually bad, see http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired 2016-10-07 03:01:03,102 INFO [master/imp1tvhdpmst1.corp.test.com/172.24.125.130:16000] regionserver.HRegionServer: Stopping infoServer 2016-10-07 03:01:03,124 INFO [master/imp1tvhdpmst1.corp.test.com/172.24.125.130:16000] mortbay.log: Stopped SelectChannelConnector@0.0.0.0:16010 2016-10-07 03:01:03,174 INFO [imp1tvhdpmst1:16000.activeMasterManager-SendThread(imp1tvhdpmst3.corp.test.com:2181)] client.ZooKeeperSaslClient: Client will use GSSAPI as SASL mechanism. 2016-10-07 03:01:03,174 INFO [imp1tvhdpmst1:16000.activeMasterManager-SendThread(imp1tvhdpmst3.corp.test.com:2181)] zookeeper.ClientCnxn: Opening socket connection to server imp1tvhdpmst3.corp.test.com/172.24.125.132:2181. Will attempt to SASL-authenticate using Login Context section 'Client' 2016-10-07 03:01:03,175 INFO [imp1tvhdpmst1:16000.activeMasterManager-SendThread(imp1tvhdpmst3.corp.test.com:2181)] zookeeper.ClientCnxn: Socket connection established to imp1tvhdpmst3.corp.test.com/172.24.125.132:2181, initiating session 2016-10-07 03:01:03,176 INFO [imp1tvhdpmst1:16000.activeMasterManager-SendThread(imp1tvhdpmst3.corp.test.com:2181)] zookeeper.ClientCnxn: Unable to reconnect to ZooKeeper service, session 0x3572d7e3c28020c has expired, closing socket connection 2016-10-07 03:01:03,176 WARN [imp1tvhdpmst1:16000.activeMasterManager-EventThread] client.ConnectionManager$HConnectionImplementation: This client just lost it's session with ZooKeeper, closing it. It will be recreated next time someone needs it org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:613) at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:524) at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:534) at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:510) 2016-10-07 03:01:03,177 INFO [imp1tvhdpmst1:16000.activeMasterManager-EventThread] client.ConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x3572d7e3c28020c 2016-10-07 03:01:03,177 INFO [imp1tvhdpmst1:16000.activeMasterManager-EventThread] zookeeper.ClientCnxn: EventThread shut down 2016-10-07 03:01:03,224 INFO [master/imp1tvhdpmst1.corp.test.com/172.24.125.130:16000] procedure2.ProcedureExecutor: Stopping the procedure executor 2016-10-07 03:01:03,225 INFO [master/imp1tvhdpmst1.corp.test.com/172.24.125.130:16000] wal.WALProcedureStore: Stopping the WAL Procedure Store 2016-10-07 03:01:03,272 INFO [master/imp1tvhdpmst1.corp.test.com/172.24.125.130:16000] regionserver.HRegionServer: stopping server imp1tvhdpmst1.corp.test.com,16000,1475152605958 2016-10-07 03:01:03,272 INFO [master/imp1tvhdpmst1.corp.test.com/172.24.125.130:16000] client.ConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x3572d7e3c28020b 2016-10-07 03:01:03,452 INFO [imp1tvhdpmst1:16000.activeMasterManager-SendThread(imp1tvhdpmst4.corp.test.com:2181)] client.ZooKeeperSaslClient: Client will use GSSAPI as SASL mechanism. 2016-10-07 03:01:03,452 INFO [imp1tvhdpmst1:16000.activeMasterManager-SendThread(imp1tvhdpmst4.corp.test.com:2181)] zookeeper.ClientCnxn: Opening socket connection to server imp1tvhdpmst4.corp.test.com/172.24.125.133:2181. Will attempt to SASL-authenticate using Login Context section 'Client' 2016-10-07 03:01:03,452 INFO [imp1tvhdpmst1:16000.activeMasterManager-SendThread(imp1tvhdpmst4.corp.test.com:2181)] zookeeper.ClientCnxn: Socket connection established to imp1tvhdpmst4.corp.test.com/172.24.125.133:2181, initiating session 2016-10-07 03:01:03,454 INFO [imp1tvhdpmst1:16000.activeMasterManager-SendThread(imp1tvhdpmst4.corp.test.com:2181)] zookeeper.ClientCnxn: Unable to reconnect to ZooKeeper service, session 0x2572d7e364901d1 has expired, closing socket connection 2016-10-07 03:01:03,454 INFO [imp1tvhdpmst1:16000.activeMasterManager-EventThread] zookeeper.ClientCnxn: EventThread shut down 2016-10-07 03:01:03,499 INFO [master/imp1tvhdpmst1.corp.test.com/172.24.125.130:16000] zookeeper.ZooKeeper: Session: 0x3572d7e3c28020b closed 2016-10-07 03:01:03,499 INFO [master/imp1tvhdpmst1.corp.test.com/172.24.125.130:16000] regionserver.HRegionServer: stopping server imp1tvhdpmst1.corp.test.com,16000,1475152605958; all regions closed. 2016-10-07 03:01:03,499 INFO [master/imp1tvhdpmst1.corp.test.com/172.24.125.130:16000-EventThread] zookeeper.ClientCnxn: EventThread shut down 2016-10-07 03:01:03,500 INFO [master/imp1tvhdpmst1.corp.test.com/172.24.125.130:16000] hbase.ChoreService: Chore service for: imp1tvhdpmst1.corp.test.com,16000,1475152605958 had [[ScheduledChore: Name: HFileCleaner Period: 60000 Unit: MILLISECONDS], [ScheduledChore: Name: imp1tvhdpmst1.corp.test.com,16000,1475152605958-BalancerChore Period: 300000 Unit: MILLISECONDS], [ScheduledChore: Name: CatalogJanitor-imp1tvhdpmst1:16000 Period: 300000 Unit: MILLISECONDS], [ScheduledChore: Name: imp1tvhdpmst1.corp.test.com,16000,1475152605958-RegionNormalizerChore Period: 1800000 Unit: MILLISECONDS], [ScheduledChore: Name: LogsCleaner Period: 60000 Unit: MILLISECONDS], [ScheduledChore: Name: imp1tvhdpmst1.corp.test.com,16000,1475152605958-ClusterStatusChore Period: 60000 Unit: MILLISECONDS]] on shutdown 2016-10-07 03:01:03,501 WARN [master/imp1tvhdpmst1.corp.test.com/172.24.125.130:16000] zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper, quorum=imp1tvhdpmst2.corp.test.com:2181,imp1tvhdpmst3.corp.test.com:2181,imp1tvhdpmst4.corp.test.com:2181, exception=org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase-secure/master 2016-10-07 03:01:03,635 ERROR [PriorityRpcServer.handler=8,queue=0,port=16000] master.MasterRpcServices: Region server imp1tvhdpslv4.corp.test.com,16020,1475152621893 reported a fatal error: ABORTING region server imp1tvhdpslv4.corp.test.com,16020,1475152621893: regionserver:16020-0x1572d7e3442032e, quorum=imp1tvhdpmst2.corp.test.com:2181,imp1tvhdpmst3.corp.test.com:2181,imp1tvhdpmst4.corp.test.com:2181, baseZNode=/hbase-secure regionserver:16020-0x1572d7e3442032e received expired from ZooKeeper, aborting Cause: org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:613) at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:524) at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:534) at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:510) 2016-10-07 03:01:03,635 ERROR [PriorityRpcServer.handler=10,queue=0,port=16000] master.MasterRpcServices: Region server imp1tvhdpslv1.corp.test.com,16020,1475152617424 reported a fatal error: ABORTING region server imp1tvhdpslv1.corp.test.com,16020,1475152617424: regionserver:16020-0x3572d7e3c28020d, quorum=imp1tvhdpmst2.corp.test.com:2181,imp1tvhdpmst3.corp.test.com:2181,imp1tvhdpmst4.corp.test.com:2181, baseZNode=/hbase-secure regionserver:16020-0x3572d7e3c28020d received expired from ZooKeeper, aborting Cause: org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:613) at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:524) at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:534) at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:510) 2016-10-07 03:01:03,867 INFO [imp1tvhdpmst1.corp.test.com,16000,1475152605958_splitLogManager__ChoreService_1] master.SplitLogManager$TimeoutMonitor: Chore: SplitLogManager Timeout Monitor was stopped 2016-10-07 03:01:04,501 WARN [master/imp1tvhdpmst1.corp.test.com/172.24.125.130:16000] zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper, quorum=imp1tvhdpmst2.corp.test.com:2181,imp1tvhdpmst3.corp.test.com:2181,imp1tvhdpmst4.corp.test.com:2181, exception=org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase-secure/master 2016-10-07 03:01:06,502 WARN [master/imp1tvhdpmst1.corp.test.com/172.24.125.130:16000] zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper, quorum=imp1tvhdpmst2.corp.test.com:2181,imp1tvhdpmst3.corp.test.com:2181,imp1tvhdpmst4.corp.test.com:2181, exception=org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase-secure/master 2016-10-07 03:01:10,509 WARN [master/imp1tvhdpmst1.corp.test.com/172.24.125.130:16000] zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper, quorum=imp1tvhdpmst2.corp.test.com:2181,imp1tvhdpmst3.corp.test.com:2181,imp1tvhdpmst4.corp.test.com:2181, exception=org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase-secure/master 2016-10-07 03:01:18,032 ERROR [PriorityRpcServer.handler=13,queue=1,port=16000] master.MasterRpcServices: Region server imp1tvhdpslv2.corp.test.com,16020,1475152618858 reported a fatal error: ABORTING region server imp1tvhdpslv2.corp.test.com,16020,1475152618858: Get list of registered region servers Cause: org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase-secure/rs at org.apache.zookeeper.KeeperException.create(KeeperException.java:127) at org.apache.zookeeper.KeeperException.create(KeeperException.java:51) at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1472) at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getChildren(RecoverableZooKeeper.java:295) at org.apache.hadoop.hbase.zookeeper.ZKUtil.listChildrenAndWatchForNewChildren(ZKUtil.java:454) at org.apache.hadoop.hbase.zookeeper.ZKUtil.listChildrenAndWatchThem(ZKUtil.java:482) at org.apache.hadoop.hbase.replication.ReplicationTrackerZKImpl.getRegisteredRegionServers(ReplicationTrackerZKImpl.java:245) at org.apache.hadoop.hbase.replication.ReplicationTrackerZKImpl.refreshOtherRegionServersList(ReplicationTrackerZKImpl.java:226) at org.apache.hadoop.hbase.replication.ReplicationTrackerZKImpl.access$400(ReplicationTrackerZKImpl.java:42) at org.apache.hadoop.hbase.replication.ReplicationTrackerZKImpl$OtherRegionServerWatcher.refreshListIfRightPath(ReplicationTrackerZKImpl.java:142) at org.apache.hadoop.hbase.replication.ReplicationTrackerZKImpl$OtherRegionServerWatcher.nodeDeleted(ReplicationTrackerZKImpl.java:117) at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:539) at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:534) at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:510) 2016-10-07 03:01:18,033 ERROR [PriorityRpcServer.handler=11,queue=1,port=16000] master.MasterRpcServices: Region server imp1tvhdpslv3.corp.test.com,16020,1475152617489 reported a fatal error: ABORTING region server imp1tvhdpslv3.corp.test.com,16020,1475152617489: Get list of registered region servers Cause: org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase-secure/rs at org.apache.zookeeper.KeeperException.create(KeeperException.java:127) at org.apache.zookeeper.KeeperException.create(KeeperException.java:51) at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1472) at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getChildren(RecoverableZooKeeper.java:295) at org.apache.hadoop.hbase.zookeeper.ZKUtil.listChildrenAndWatchForNewChildren(ZKUtil.java:454) at org.apache.hadoop.hbase.zookeeper.ZKUtil.listChildrenAndWatchThem(ZKUtil.java:482) at org.apache.hadoop.hbase.replication.ReplicationTrackerZKImpl.getRegisteredRegionServers(ReplicationTrackerZKImpl.java:245) at org.apache.hadoop.hbase.replication.ReplicationTrackerZKImpl.refreshOtherRegionServersList(ReplicationTrackerZKImpl.java:226) at org.apache.hadoop.hbase.replication.ReplicationTrackerZKImpl.access$400(ReplicationTrackerZKImpl.java:42) at org.apache.hadoop.hbase.replication.ReplicationTrackerZKImpl$OtherRegionServerWatcher.refreshListIfRightPath(ReplicationTrackerZKImpl.java:142) at org.apache.hadoop.hbase.replication.ReplicationTrackerZKImpl$OtherRegionServerWatcher.nodeDeleted(ReplicationTrackerZKImpl.java:117) at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:539) at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:534) at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:510) 2016-10-07 03:01:18,336 ERROR [PriorityRpcServer.handler=7,queue=1,port=16000] master.MasterRpcServices: Region server imp1tvhdpslv2.corp.test.com,16020,1475152618858 reported a fatal error: ABORTING region server imp1tvhdpslv2.corp.test.com,16020,1475152618858: regionserver:16020-0x2572d7e364901d4, quorum=imp1tvhdpmst2.corp.test.com:2181,imp1tvhdpmst3.corp.test.com:2181,imp1tvhdpmst4.corp.test.com:2181, baseZNode=/hbase-secure regionserver:16020-0x2572d7e364901d4 received expired from ZooKeeper, aborting Cause: org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:613) at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:524) at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:534) at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:510) 2016-10-07 03:01:18,372 ERROR [PriorityRpcServer.handler=6,queue=0,port=16000] master.MasterRpcServices: Region server imp1tvhdpslv3.corp.test.com,16020,1475152617489 reported a fatal error: ABORTING region server imp1tvhdpslv3.corp.test.com,16020,1475152617489: regionserver:16020-0x1572d7e3442032d, quorum=imp1tvhdpmst2.corp.test.com:2181,imp1tvhdpmst3.corp.test.com:2181,imp1tvhdpmst4.corp.test.com:2181, baseZNode=/hbase-secure regionserver:16020-0x1572d7e3442032d received expired from ZooKeeper, aborting Cause: org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:613) at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:524) at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:534) at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:510) 2016-10-07 03:01:18,509 WARN [master/imp1tvhdpmst1.corp.test.com/172.24.125.130:16000] zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper, quorum=imp1tvhdpmst2.corp.test.com:2181,imp1tvhdpmst3.corp.test.com:2181,imp1tvhdpmst4.corp.test.com:2181, exception=org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase-secure/master 2016-10-07 03:01:18,510 ERROR [master/imp1tvhdpmst1.corp.test.com/172.24.125.130:16000] zookeeper.RecoverableZooKeeper: ZooKeeper getData failed after 4 attempts 2016-10-07 03:01:18,510 WARN [master/imp1tvhdpmst1.corp.test.com/172.24.125.130:16000] zookeeper.ZKUtil: master:16000-0x1572d7e3442032c, quorum=imp1tvhdpmst2.corp.test.com:2181,imp1tvhdpmst3.corp.test.com:2181,imp1tvhdpmst4.corp.test.com:2181, baseZNode=/hbase-secure Unable to get data of znode /hbase-secure/master org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase-secure/master at org.apache.zookeeper.KeeperException.create(KeeperException.java:127) at org.apache.zookeeper.KeeperException.create(KeeperException.java:51) at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1155) at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:359) at org.apache.hadoop.hbase.zookeeper.ZKUtil.getData(ZKUtil.java:621) at org.apache.hadoop.hbase.zookeeper.MasterAddressTracker.getMasterAddress(MasterAddressTracker.java:148) at org.apache.hadoop.hbase.master.ActiveMasterManager.stop(ActiveMasterManager.java:267) at org.apache.hadoop.hbase.master.HMaster.stopServiceThreads(HMaster.java:1175) at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:1071) at java.lang.Thread.run(Thread.java:745) 2016-10-07 03:01:18,510 ERROR [master/imp1tvhdpmst1.corp.test.com/172.24.125.130:16000] zookeeper.ZooKeeperWatcher: master:16000-0x1572d7e3442032c, quorum=imp1tvhdpmst2.corp.test.com:2181,imp1tvhdpmst3.corp.test.com:2181,imp1tvhdpmst4.corp.test.com:2181, baseZNode=/hbase-secure Received unexpected KeeperException, re-throwing exception org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase-secure/master at org.apache.zookeeper.KeeperException.create(KeeperException.java:127) at org.apache.zookeeper.KeeperException.create(KeeperException.java:51) at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1155) at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:359) at org.apache.hadoop.hbase.zookeeper.ZKUtil.getData(ZKUtil.java:621) at org.apache.hadoop.hbase.zookeeper.MasterAddressTracker.getMasterAddress(MasterAddressTracker.java:148) at org.apache.hadoop.hbase.master.ActiveMasterManager.stop(ActiveMasterManager.java:267) at org.apache.hadoop.hbase.master.HMaster.stopServiceThreads(HMaster.java:1175) at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:1071) at java.lang.Thread.run(Thread.java:745) 2016-10-07 03:01:18,510 ERROR [master/imp1tvhdpmst1.corp.test.com/172.24.125.130:16000] master.ActiveMasterManager: master:16000-0x1572d7e3442032c, quorum=imp1tvhdpmst2.corp.test.com:2181,imp1tvhdpmst3.corp.test.com:2181,imp1tvhdpmst4.corp.test.com:2181, baseZNode=/hbase-secure Error deleting our own master address node org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase-secure/master at org.apache.zookeeper.KeeperException.create(KeeperException.java:127) at org.apache.zookeeper.KeeperException.create(KeeperException.java:51) at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1155) at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:359) at org.apache.hadoop.hbase.zookeeper.ZKUtil.getData(ZKUtil.java:621) at org.apache.hadoop.hbase.zookeeper.MasterAddressTracker.getMasterAddress(MasterAddressTracker.java:148) at org.apache.hadoop.hbase.master.ActiveMasterManager.stop(ActiveMasterManager.java:267) at org.apache.hadoop.hbase.master.HMaster.stopServiceThreads(HMaster.java:1175) at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:1071) at java.lang.Thread.run(Thread.java:745) 2016-10-07 03:01:18,512 INFO [master/imp1tvhdpmst1.corp.test.com/172.24.125.130:16000] hbase.ChoreService: Chore service for: imp1tvhdpmst1.corp.test.com,16000,1475152605958_splitLogManager_ had [] on shutdown 2016-10-07 03:01:18,512 INFO [master/imp1tvhdpmst1.corp.test.com/172.24.125.130:16000] flush.MasterFlushTableProcedureManager: stop: server shutting down. 2016-10-07 03:01:18,512 INFO [master/imp1tvhdpmst1.corp.test.com/172.24.125.130:16000] ipc.RpcServer: Stopping server on 16000 2016-10-07 03:01:18,512 WARN [master/imp1tvhdpmst1.corp.test.com/172.24.125.130:16000] zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper, quorum=imp1tvhdpmst2.corp.test.com:2181,imp1tvhdpmst3.corp.test.com:2181,imp1tvhdpmst4.corp.test.com:2181, exception=org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase-secure/tokenauth/keymaster 2016-10-07 03:01:19,512 WARN [master/imp1tvhdpmst1.corp.test.com/172.24.125.130:16000] zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper, quorum=imp1tvhdpmst2.corp.test.com:2181,imp1tvhdpmst3.corp.test.com:2181,imp1tvhdpmst4.corp.test.com:2181, exception=org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase-secure/tokenauth/keymaster 2016-10-07 03:01:21,512 WARN [master/imp1tvhdpmst1.corp.test.com/172.24.125.130:16000] zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper, quorum=imp1tvhdpmst2.corp.test.com:2181,imp1tvhdpmst3.corp.test.com:2181,imp1tvhdpmst4.corp.test.com:2181, exception=org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase-secure/tokenauth/keymaster 2016-10-07 03:01:23,132 WARN [LeaseRenewer:hbase@imp1tvhdpmst3.corp.test.com:8020] hdfs.LeaseRenewer: Failed to renew lease for [DFSClient_NONMAPREDUCE_-1572577966_107] for 30 seconds. Will retry shortly ... org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category WRITE is not supported in state standby at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:87) at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:1932) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1313) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.renewLease(FSNamesystem.java:4534) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.renewLease(NameNodeRpcServer.java:1089) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.renewLease(ClientNamenodeProtocolServerSideTranslatorPB.java:660) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2206) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2202) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2200) at org.apache.hadoop.ipc.Client.call(Client.java:1426) at org.apache.hadoop.ipc.Client.call(Client.java:1363) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229) at com.sun.proxy.$Proxy19.renewLease(Unknown Source) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.renewLease(ClientNamenodeProtocolTranslatorPB.java:592) at sun.reflect.GeneratedMethodAccessor124.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104) at com.sun.proxy.$Proxy20.renewLease(Unknown Source) at org.apache.hadoop.hdfs.DFSClient.renewLease(DFSClient.java:892) at org.apache.hadoop.hdfs.LeaseRenewer.renew(LeaseRenewer.java:417) at org.apache.hadoop.hdfs.LeaseRenewer.run(LeaseRenewer.java:442) at org.apache.hadoop.hdfs.LeaseRenewer.access$700(LeaseRenewer.java:71) at org.apache.hadoop.hdfs.LeaseRenewer$1.run(LeaseRenewer.java:298) at java.lang.Thread.run(Thread.java:745) 2016-10-07 03:01:24,135 WARN [LeaseRenewer:hbase@imp1tvhdpmst3.corp.test.com:8020] hdfs.LeaseRenewer: Failed to renew lease for [DFSClient_NONMAPREDUCE_-1572577966_107] for 31 seconds. Will retry shortly ... org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category WRITE is not supported in state standby at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:87) at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:1932) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1313) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.renewLease(FSNamesystem.java:4534) Regionserver Log: 2016-10-07 03:01:02,894 INFO [main-SendThread(imp1tvhdpmst4.corp.test.com:2181)] zookeeper.ClientCnxn: Client session timed out, have not heard from server in 71395ms for sessionid 0x3572d7e3c28020d, closing socket connection and attempting reconnect 2016-10-07 03:01:02,899 INFO [regionserver/imp1tvhdpslv1.corp.test.com/172.24.125.134:16020-SendThread(imp1tvhdpmst3.corp.test.com:2181)] zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x2572d7e364901d2, likely server has closed socket, closing socket connection and attempting reconnect 2016-10-07 03:01:03,603 INFO [main-SendThread(imp1tvhdpmst2.corp.test.com:2181)] client.ZooKeeperSaslClient: Client will use GSSAPI as SASL mechanism. 2016-10-07 03:01:03,603 INFO [main-SendThread(imp1tvhdpmst2.corp.test.com:2181)] zookeeper.ClientCnxn: Opening socket connection to server imp1tvhdpmst2.corp.test.com/172.24.125.131:2181. Will attempt to SASL-authenticate using Login Context section 'Client' 2016-10-07 03:01:03,604 INFO [main-SendThread(imp1tvhdpmst2.corp.test.com:2181)] zookeeper.ClientCnxn: Socket connection established to imp1tvhdpmst2.corp.test.com/172.24.125.131:2181, initiating session 2016-10-07 03:01:03,608 INFO [main-SendThread(imp1tvhdpmst2.corp.test.com:2181)] zookeeper.ClientCnxn: Unable to reconnect to ZooKeeper service, session 0x3572d7e3c28020d has expired, closing socket connection 2016-10-07 03:01:03,608 FATAL [main-EventThread] regionserver.HRegionServer: ABORTING region server imp1tvhdpslv1.corp.test.com,16020,1475152617424: regionserver:16020-0x3572d7e3c28020d, quorum=imp1tvhdpmst2.corp.test.com:2181,imp1tvhdpmst3.corp.test.com:2181,imp1tvhdpmst4.corp.test.com:2181, baseZNode=/hbase-secure regionserver:16020-0x3572d7e3c28020d received expired from ZooKeeper, aborting org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:613) at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:524) at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:534) at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:510) 2016-10-07 03:01:03,608 FATAL [main-EventThread] regionserver.HRegionServer: RegionServer abort: loaded coprocessors are: [org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint, org.apache.hadoop.hbase.security.token.TokenProvider, org.apache.hadoop.hbase.security.access.SecureBulkLoadEndpoint, org.apache.ranger.authorization.hbase.RangerAuthorizationCoprocessor] 2016-10-07 03:01:03,623 INFO [main-EventThread] regionserver.HRegionServer: Dump of metrics as JSON on abort: { "beans" : [ { "name" : "java.lang:type=Memory", "modelerType" : "sun.management.MemoryImpl", "NonHeapMemoryUsage" : { "committed" : 142868480, "init" : 136773632, "max" : 184549376, "used" : 63971032 }, "ObjectPendingFinalizationCount" : 0, "Verbose" : true, "HeapMemoryUsage" : { "committed" : 8536260608, "init" : 8589934592, "max" : 8536260608, "used" : 795318008 }, "ObjectName" : "java.lang:type=Memory" } ], "beans" : [ { "name" : "Hadoop:service=HBase,name=RegionServer,sub=IPC", "modelerType" : "RegionServer,sub=IPC", "tag.Context" : "regionserver", "tag.Hostname" : "imp1tvhdpslv1.corp.test.com", "queueSize" : 0, "numCallsInGeneralQueue" : 0, "numCallsInReplicationQueue" : 0, "numCallsInPriorityQueue" : 0, "numOpenConnections" : 0, "numActiveHandler" : 0, "TotalCallTime_num_ops" : 6766, "TotalCallTime_min" : 0, "TotalCallTime_max" : 4398, "TotalCallTime_mean" : 22.693319538870824, "TotalCallTime_median" : 0.0, "TotalCallTime_75th_percentile" : 1.0, "TotalCallTime_95th_percentile" : 85.0, "TotalCallTime_99th_percentile" : 251.58999999999992, "exceptions.FailedSanityCheckException" : 0, "exceptions.RegionMovedException" : 0, "QueueCallTime_num_ops" : 6766, "QueueCallTime_min" : 0, "QueueCallTime_max" : 25, "QueueCallTime_mean" : 0.11409991132131245, "QueueCallTime_median" : 0.0, "QueueCallTime_75th_percentile" : 0.0, "QueueCallTime_95th_percentile" : 1.0, "QueueCallTime_99th_percentile" : 1.0, "authenticationFailures" : 0, "authorizationFailures" : 0, "exceptions" : 6, "authenticationSuccesses" : 74, "authorizationSuccesses" : 74, "ProcessCallTime_num_ops" : 6766, "ProcessCallTime_min" : 0, "ProcessCallTime_max" : 4398, "ProcessCallTime_mean" : 22.579219627549513, "ProcessCallTime_median" : 0.0, "ProcessCallTime_75th_percentile" : 1.0, "ProcessCallTime_95th_percentile" : 85.0, "ProcessCallTime_99th_percentile" : 251.58999999999992, "exceptions.NotServingRegionException" : 6, "sentBytes" : 82090551, "exceptions.RegionTooBusyException" : 0, "receivedBytes" : 205125313, "exceptions.OutOfOrderScannerNextException" : 0, "exceptions.UnknownScannerException" : 0 } ], "beans" : [ { "name" : "Hadoop:service=HBase,name=RegionServer,sub=Replication", "modelerType" : "RegionServer,sub=Replication", "tag.Context" : "regionserver", "tag.Hostname" : "imp1tvhdpslv1.corp.test.com", "sink.appliedOps" : 0, "sink.appliedBatches" : 0, "sink.ageOfLastAppliedOp" : 0 } ], "beans" : [ { "name" : "Hadoop:service=HBase,name=RegionServer,sub=Server", "modelerType" : "RegionServer,sub=Server", "tag.zookeeperQuorum" : "imp1tvhdpmst2.corp.test.com:2181,imp1tvhdpmst3.corp.test.com:2181,imp1tvhdpmst4.corp.test.com:2181", "tag.serverName" : "imp1tvhdpslv1.corp.test.com,16020,1475152617424", "tag.clusterId" : "af4af9f4-724e-42a0-9b08-c3ec0b260656", "tag.Context" : "regionserver", "tag.Hostname" : "imp1tvhdpslv1.corp.test.com", "regionCount" : 12, "storeCount" : 12, "hlogFileCount" : 2, "hlogFileSize" : 0, "storeFileCount" : 11, "memStoreSize" : 5088, "storeFileSize" : 22766460359, "regionServerStartTime" : 1475152617424, "totalRequestCount" : 3028580, "readRequestCount" : 6564, "writeRequestCount" : 252, "checkMutateFailedCount" : 0, "checkMutatePassedCount" : 0, "storeFileIndexSize" : 17472, "staticIndexSize" : 19363382, "staticBloomSize" : 271565340, "mutationsWithoutWALCount" : 0, "mutationsWithoutWALSize" : 0, "percentFilesLocal" : 66, "percentFilesLocalSecondaryRegions" : 0, "splitQueueLength" : 0, "compactionQueueLength" : 0, "flushQueueLength" : 0, "blockCacheFreeSize" : 3410842632, "blockCacheCount" : 3, "blockCacheSize" : 3661560, "blockCacheHitCount" : 3171629, "blockCacheHitCountPrimary" : 3171629, "blockCacheMissCount" : 51343, "blockCacheMissCountPrimary" : 51343, "blockCacheEvictionCount" : 4600, "blockCacheEvictionCountPrimary" : 4600, "blockCacheCountHitPercent" : 98.0, "blockCacheExpressHitPercent" : 99, "updatesBlockedTime" : 0, "flushedCellsCount" : 252, "compactedCellsCount" : 0, "majorCompactedCellsCount" : 0, "flushedCellsSize" : 57768, "compactedCellsSize" : 0, "majorCompactedCellsSize" : 0, "blockedRequestCount" : 0, "splitSuccessCount" : 1, "splitRequestCount" : 1, "Append_num_ops" : 0, "Append_min" : 0, "Append_max" : 0, "Append_mean" : 0.0, "Append_median" : 0.0, "Append_75th_percentile" : 0.0, "Append_95th_percentile" : 0.0, "Append_99th_percentile" : 0.0, "Delete_num_ops" : 0, "Delete_min" : 0, "Delete_max" : 0, "Delete_mean" : 0.0, "Delete_median" : 0.0, "Delete_75th_percentile" : 0.0, "Delete_95th_percentile" : 0.0, "Delete_99th_percentile" : 0.0, "Mutate_num_ops" : 583, "Mutate_min" : 1, "Mutate_max" : 412, "Mutate_mean" : 61.06003430531732, "Mutate_median" : 79.0, "Mutate_75th_percentile" : 84.0, "Mutate_95th_percentile" : 110.0, "Mutate_99th_percentile" : 126.6, "ScanNext_num_ops" : 21, "ScanNext_min" : 0, "ScanNext_max" : 31287, "ScanNext_mean" : 16041.0, "ScanNext_median" : 9034.0, "ScanNext_75th_percentile" : 12580.0, "ScanNext_95th_percentile" : 12680.0, "ScanNext_99th_percentile" : 12680.0, "slowDeleteCount" : 0, "slowIncrementCount" : 0, "FlushTime_num_ops" : 3, "FlushTime_min" : 1239, "FlushTime_max" : 21322, "FlushTime_mean" : 13931.333333333334, "FlushTime_median" : 19233.0, "FlushTime_75th_percentile" : 21322.0, "FlushTime_95th_percentile" : 21322.0, "FlushTime_99th_percentile" : 21322.0, "Get_num_ops" : 5372, "Get_min" : 0, "Get_max" : 47, "Get_mean" : 0.22412509307520476, "Get_median" : 0.0, "Get_75th_percentile" : 0.0, "Get_95th_percentile" : 1.0, "Get_99th_percentile" : 1.0, "Replay_num_ops" : 0, "Replay_min" : 0, "Replay_max" : 0, "Replay_mean" : 0.0, "Replay_median" : 0.0, "Replay_75th_percentile" : 0.0, "Replay_95th_percentile" : 0.0, "Replay_99th_percentile" : 0.0, "slowGetCount" : 0, "slowAppendCount" : 0, "slowPutCount" : 0, "SplitTime_num_ops" : 1, "SplitTime_min" : 4334, "SplitTime_max" : 4334, "SplitTime_mean" : 4334.0, "SplitTime_median" : 4334.0, "SplitTime_75th_percentile" : 4334.0, "SplitTime_95th_percentile" : 4334.0, "SplitTime_99th_percentile" : 4334.0, "Increment_num_ops" : 0, "Increment_min" : 0, "Increment_max" : 0, "Increment_mean" : 0.0, "Increment_median" : 0.0, "Increment_75th_percentile" : 0.0, "Increment_95th_percentile" : 0.0, "Increment_99th_percentile" : 0.0 } ] } 2016-10-07 03:01:03,638 INFO [main-EventThread] regionserver.HRegionServer: STOPPED: regionserver:16020-0x3572d7e3c28020d, quorum=imp1tvhdpmst2.corp.test.com:2181,imp1tvhdpmst3.corp.test.com:2181,imp1tvhdpmst4.corp.test.com:2181, baseZNode=/hbase-secure regionserver:16020-0x3572d7e3c28020d received expired from ZooKeeper, aborting 2016-10-07 03:01:03,638 INFO [main-EventThread] zookeeper.ClientCnxn: EventThread shut down 2016-10-07 03:01:03,638 WARN [regionserver/imp1tvhdpslv1.corp.test.com/172.24.125.134:16020] util.Sleeper: We slept 62101ms instead of 3000ms, this is likely due to a long garbage collecting pause and it's usually bad, see http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired 2016-10-07 03:01:03,639 INFO [org.apache.ranger.audit.queue.AuditBatchQueue0] provider.BaseAuditHandler: Audit Status Log: name=hbaseRegional.async.summary.multi_dest.batch, finalDestination=hbaseRegional.async.summary.multi_dest.batch.hdfs, interval=01:03.866 minutes, events=1, totalEvents=585, totalSuccessCount=250 2016-10-07 03:01:03,639 INFO [org.apache.ranger.audit.queue.AuditBatchQueue1] provider.BaseAuditHandler: Audit Status Log: name=hbaseRegional.async.summary.multi_dest.batch, finalDestination=hbaseRegional.async.summary.multi_dest.batch.solr, interval=01:03.860 minutes, events=1, totalEvents=552, totalSuccessCount=250 2016-10-07 03:01:03,639 INFO [regionserver/imp1tvhdpslv1.corp.test.com/172.24.125.134:16020] regionserver.SplitLogWorker: Sending interrupt to stop the worker thread 2016-10-07 03:01:03,640 INFO [regionserver/imp1tvhdpslv1.corp.test.com/172.24.125.134:16020] regionserver.HRegionServer: Stopping infoServer 2016-10-07 03:01:03,640 INFO [SplitLogWorker-imp1tvhdpslv1:16020] regionserver.SplitLogWorker: SplitLogWorker interrupted. Exiting. 2016-10-07 03:01:03,640 INFO [org.apache.ranger.audit.queue.AuditBatchQueue1] provider.BaseAuditHandler: Audit Status Log: name=hbaseRegional.async.summary.multi_dest.batch.solr, interval=08:47:58.911 hours, events=2, succcessCount=2, totalEvents=250, totalSuccessCount=250 2016-10-07 03:01:03,640 INFO [org.apache.ranger.audit.queue.AuditBatchQueue0] provider.BaseAuditHandler: Audit Status Log: name=hbaseRegional.async.summary.multi_dest.batch.hdfs, interval=08:47:58.893 hours, events=2, succcessCount=2, totalEvents=250, totalSuccessCount=250 2016-10-07 03:01:03,640 INFO [SplitLogWorker-imp1tvhdpslv1:16020] regionserver.SplitLogWorker: SplitLogWorker imp1tvhdpslv1.corp.test.com,16020,1475152617424 exiting 2016-10-07 03:01:03,642 INFO [regionserver/imp1tvhdpslv1.corp.test.com/172.24.125.134:16020] mortbay.log: Stopped SelectChannelConnector@0.0.0.0:16030 2016-10-07 03:01:03,743 INFO [regionserver/imp1tvhdpslv1.corp.test.com/172.24.125.134:16020] regionserver.HeapMemoryManager: Stoping HeapMemoryTuner chore. 2016-10-07 03:01:03,744 INFO [regionserver/imp1tvhdpslv1.corp.test.com/172.24.125.134:16020] snapshot.RegionServerSnapshotManager: Stopping RegionServerSnapshotManager abruptly. 2016-10-07 03:01:03,744 INFO [MemStoreFlusher.1] regionserver.MemStoreFlusher: MemStoreFlusher.1 exiting 2016-10-07 03:01:03,744 INFO [MemStoreFlusher.0] regionserver.MemStoreFlusher: MemStoreFlusher.0 exiting 2016-10-07 03:01:03,744 INFO [regionserver/imp1tvhdpslv1.corp.test.com/172.24.125.134:16020] flush.RegionServerFlushTableProcedureManager: Stopping region server flush procedure manager abruptly. 2016-10-07 03:01:03,749 INFO [StoreCloserThread-ambarismoketest,,1466187475382.7ee7f2de55e70e37f2fb23624f9e593e.-1] regionserver.HStore: Closed family 2016-10-07 03:01:03,750 INFO [regionserver/imp1tvhdpslv1.corp.test.com/172.24.125.134:16020] regionserver.HRegionServer: aborting server imp1tvhdpslv1.corp.test.com,16020,1475152617424 2016-10-07 03:01:03,750 INFO [RS_CLOSE_REGION-imp1tvhdpslv1:16020-2] regionserver.HRegion: Closed ambarismoketest,,1466187475382.7ee7f2de55e70e37f2fb23624f9e593e. 2016-10-07 03:01:03,750 INFO [regionserver/imp1tvhdpslv1.corp.test.com/172.24.125.134:16020] client.ConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x2572d7e364901d2 2016-10-07 03:01:03,750 INFO [regionserver/imp1tvhdpslv1.corp.test.com/172.24.125.134:16020-SendThread(imp1tvhdpmst4.corp.test.com:2181)] client.ZooKeeperSaslClient: Client will use GSSAPI as SASL mechanism. 2016-10-07 03:01:03,752 INFO [StoreCloserThread-du_secure_vault_tst_jcb,,1475653102504.d6e8b646cabc9a73fee39da3f71afcc9.-1] regionserver.HStore: Closed vault 2016-10-07 03:01:03,752 INFO [StoreCloserThread-du_secure_vault_onedaydata_dq,contract_main_resource_num_token_99445B2,1473230358405.57a88265f314ba4a9ff713deef241c37.-1] regionserver.HStore: Closed vault 2016-10-07 03:01:03,753 INFO [regionserver/imp1tvhdpslv1.corp.test.com/172.24.125.134:16020-SendThread(imp1tvhdpmst4.corp.test.com:2181)] zookeeper.ClientCnxn: Opening socket connection to server imp1tvhdpmst4.corp.test.com/172.24.125.133:2181. Will attempt to SASL-authenticate using Login Context section 'Client' 2016-10-07 03:01:03,753 INFO [RS_CLOSE_REGION-imp1tvhdpslv1:16020-1] regionserver.HRegion: Closed du_secure_vault_tst_jcb,,1475653102504.d6e8b646cabc9a73fee39da3f71afcc9. 2016-10-07 03:01:03,753 INFO [regionserver/imp1tvhdpslv1.corp.test.com/172.24.125.134:16020-SendThread(imp1tvhdpmst4.corp.test.com:2181)] zookeeper.ClientCnxn: Socket connection established to imp1tvhdpmst4.corp.test.com/172.24.125.133:2181, initiating session 2016-10-07 03:01:03,755 INFO [RS_CLOSE_REGION-imp1tvhdpslv1:16020-0] regionserver.HRegion: Closed du_secure_vault_onedaydata_dq,contract_main_resource_num_token_99445B2,1473230358405.57a88265f314ba4a9ff713deef241c37. 2016-10-07 03:01:03,755 INFO [StoreCloserThread-sun_test,,1464509020983.12c9c4b8684478c63917f0253bf930b8.-1] regionserver.HStore: Closed cf 2016-10-07 03:01:03,755 INFO [RS_CLOSE_REGION-imp1tvhdpslv1:16020-2] regionserver.HRegion: Closed sun_test,,1464509020983.12c9c4b8684478c63917f0253bf930b8. 2016-10-07 03:01:03,757 INFO [StoreCloserThread-du_secure_vault_tst_may,,1475763178137.0d211090290d0cac9aceb17d29c2e05a.-1] regionserver.HStore: Closed vault 2016-10-07 03:01:03,757 INFO [RS_CLOSE_REGION-imp1tvhdpslv1:16020-0] regionserver.HRegion: Closed du_secure_vault_tst_may,,1475763178137.0d211090290d0cac9aceb17d29c2e05a. 2016-10-07 03:01:03,758 INFO [StoreCloserThread-du_hbase_krb_tst,,1465128507241.11f2c36247ad6f3348ca4013eb99daad.-1] regionserver.HStore: Closed vault 2016-10-07 03:01:03,758 INFO [StoreCloserThread-du_secure_vault_onedaydata_dq,msisdn_raw_key_97152680792,1472729475904.00d916acbfa9044e067b9b6b9ec0e601.-1] regionserver.HStore: Closed vault 2016-10-07 03:01:03,759 INFO [RS_CLOSE_REGION-imp1tvhdpslv1:16020-1] regionserver.HRegion: Closed du_hbase_krb_tst,,1465128507241.11f2c36247ad6f3348ca4013eb99daad. 2016-10-07 03:01:03,759 INFO [RS_CLOSE_REGION-imp1tvhdpslv1:16020-2] regionserver.HRegion: Closed du_secure_vault_onedaydata_dq,msisdn_raw_key_97152680792,1472729475904.00d916acbfa9044e067b9b6b9ec0e601. 2016-10-07 03:01:03,762 INFO [StoreCloserThread-du_secure_vault_onedaydata,callingnormalized_key_97155790835,1470233185954.293f370188569fa959fe704c6eca64e8.-1] regionserver.HStore: Closed vault 2016-10-07 03:01:03,762 INFO [RS_CLOSE_REGION-imp1tvhdpslv1:16020-0] regionserver.HRegion: Closed du_secure_vault_onedaydata,callingnormalized_key_97155790835,1470233185954.293f370188569fa959fe704c6eca64e8. 2016-10-07 03:01:03,763 INFO [StoreCloserThread-du_secure_vault_onedaydata,contract_storage_medium_num_token_849867732,1470235487700.89859097282ea3fb2868bc90ded9653a.-1] regionserver.HStore: Closed vault 2016-10-07 03:01:03,764 INFO [StoreCloserThread-du_secure_vault_onedaydata,imsi_key_42403021011613,1468488066776.0c2171e0aaba28588914a58f5f70152e.-1] regionserver.HStore: Closed vault 2016-10-07 03:01:03,764 INFO [RS_CLOSE_REGION-imp1tvhdpslv1:16020-1] regionserver.HRegion: Closed du_secure_vault_onedaydata,contract_storage_medium_num_token_849867732,1470235487700.89859097282ea3fb2868bc90ded9653a. 2016-10-07 03:01:03,765 INFO [RS_CLOSE_REGION-imp1tvhdpslv1:16020-2] regionserver.HRegion: Closed du_secure_vault_onedaydata,imsi_key_42403021011613,1468488066776.0c2171e0aaba28588914a58f5f70152e. 2016-10-07 03:01:03,767 INFO [StoreCloserThread-du_secure_vault_onedaydata_dq,cust_id_token_5568245463,1474068738847.8110480ef3807a8e15f7508a979f6fe8.-1] regionserver.HStore: Closed vault 2016-10-07 03:01:03,767 INFO [StoreCloserThread-du_secure_vault,,1465883027720.d506c1c78844ea16909341ca618e18e3.-1] regionserver.HStore: Closed vault 2016-10-07 03:01:03,767 INFO [RS_CLOSE_REGION-imp1tvhdpslv1:16020-0] regionserver.HRegion: Closed du_secure_vault_onedaydata_dq,cust_id_token_5568245463,1474068738847.8110480ef3807a8e15f7508a979f6fe8. 2016-10-07 03:01:03,768 INFO [RS_CLOSE_REGION-imp1tvhdpslv1:16020-1] regionserver.HRegion: Closed du_secure_vault,,1465883027720.d506c1c78844ea16909341ca618e18e3. 2016-10-07 03:01:03,857 INFO [regionserver/imp1tvhdpslv1.corp.test.com/172.24.125.134:16020] zookeeper.ZooKeeper: Session: 0x2572d7e364901d2 closed 2016-10-07 03:01:03,857 INFO [regionserver/imp1tvhdpslv1.corp.test.com/172.24.125.134:16020-EventThread] zookeeper.ClientCnxn: EventThread shut down 2016-10-07 03:01:03,857 INFO [regionserver/imp1tvhdpslv1.corp.test.com/172.24.125.134:16020] regionserver.HRegionServer: stopping server imp1tvhdpslv1.corp.test.com,16020,1475152617424; all regions closed. 2016-10-07 03:01:03,895 INFO [regionserver/imp1tvhdpslv1.corp.test.com/172.24.125.134:16020] regionserver.Leases: regionserver/imp1tvhdpslv1.corp.test.com/172.24.125.134:16020 closing leases 2016-10-07 03:01:03,895 INFO [regionserver/imp1tvhdpslv1.corp.test.com/172.24.125.134:16020] regionserver.Leases: regionserver/imp1tvhdpslv1.corp.test.com/172.24.125.134:16020 closed leases 2016-10-07 03:01:03,896 INFO [regionserver/imp1tvhdpslv1.corp.test.com/172.24.125.134:16020] hbase.ChoreService: Chore service for: imp1tvhdpslv1.corp.test.com,16020,1475152617424 had [[ScheduledChore: Name: MovedRegionsCleaner for region imp1tvhdpslv1.corp.test.com,16020,1475152617424 Period: 120000 Unit: MILLISECONDS], [ScheduledChore: Name: imp1tvhdpslv1.corp.test.com,16020,1475152617424-MemstoreFlusherChore Period: 10000 Unit: MILLISECONDS]] on shutdown 2016-10-07 03:01:08,678 INFO [RS_OPEN_META-imp1tvhdpslv1:16020-0-MetaLogRoller] regionserver.LogRoller: LogRoller exiting. 2016-10-07 03:01:08,679 INFO [regionserver/imp1tvhdpslv1.corp.test.com/172.24.125.134:16020.logRoller] regionserver.LogRoller: LogRoller exiting. 2016-10-07 03:01:08,680 INFO [regionserver/imp1tvhdpslv1.corp.test.com/172.24.125.134:16020] regionserver.CompactSplitThread: Waiting for Split Thread to finish... 2016-10-07 03:01:08,680 INFO [regionserver/imp1tvhdpslv1.corp.test.com/172.24.125.134:16020] regionserver.CompactSplitThread: Waiting for Merge Thread to finish... 2016-10-07 03:01:08,680 INFO [regionserver/imp1tvhdpslv1.corp.test.com/172.24.125.134:16020] regionserver.CompactSplitThread: Waiting for Large Compaction Thread to finish... 2016-10-07 03:01:08,680 INFO [regionserver/imp1tvhdpslv1.corp.test.com/172.24.125.134:16020] region
Created 10-07-2016 08:56 PM
The regionserver is failing because of this:
Please check your GC settings and tune the GC.
Created 10-07-2016 08:56 PM
The regionserver is failing because of this:
Please check your GC settings and tune the GC.