Member since 10-19-2017
21 Posts
4 Kudos Received
0 Solutions
04-02-2018 11:13 AM
YARN Queue Manager is not opening. (Attachment: yarn-queue-manager.png)
Labels: Apache YARN
01-21-2018 07:11 AM
The active HBase Master went down and failover happened, making the standby HBase Master active. After some time, however, all the region servers went down one by one, the standby HBase Master also went down, and finally the whole HBase cluster went offline. We have since restarted all the services to bring the cluster back up.

Active HBase Master logs:
2018-01-17 16:22:29,895 ERROR [master/post-om2.vodafone.flytxt.com/10.88.8.79:16000] master.ActiveMasterManager: master:16000-0x35ee3bbff600001, quorum=post-om2.vodafone.flytxt.com:2181,post-om1.vodafone.flytxt.com:2181,post-os1.vodafone.flytxt.com:2181, baseZNode=/hbase Error deleting our own master address node
org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase/master
at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1155)
at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:359)
at org.apache.hadoop.hbase.zookeeper.ZKUtil.getData(ZKUtil.java:745)
at org.apache.hadoop.hbase.zookeeper.MasterAddressTracker.getMasterAddress(MasterAddressTracker.java:148)
at org.apache.hadoop.hbase.master.ActiveMasterManager.stop(ActiveMasterManager.java:267)
at org.apache.hadoop.hbase.master.HMaster.stopServiceThreads(HMaster.java:1145)
at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:1071)
at java.lang.Thread.run(Thread.java:744)
2018-01-17 16:22:29,895 INFO [master/post-om2.vodafone.flytxt.com/10.88.8.79:16000] client.ConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x15ee3bbf9910003
2018-01-17 16:22:29,897 INFO [master/post-om2.vodafone.flytxt.com/10.88.8.79:16000] zookeeper.ZooKeeper: Session: 0x15ee3bbf9910003 closed
2018-01-17 16:22:29,897 INFO [post-om2:16000.activeMasterManager-EventThread] zookeeper.ClientCnxn: EventThread shut down
2018-01-17 16:22:29,899 INFO [master/post-om2.vodafone.flytxt.com/10.88.8.79:16000] flush.MasterFlushTableProcedureManager: stop: server shutting down.
2018-01-17 16:22:29,899 INFO [master/post-om2.vodafone.flytxt.com/10.88.8.79:16000] ipc.RpcServer: Stopping server on 16000
2018-01-17 16:22:29,900 INFO [RpcServer.listener,port=16000] ipc.RpcServer: RpcServer.listener,port=16000: stopping

New active (previously standby) HBase Master logs:
2018-01-17 16:23:06,025 INFO [post-om1:16000.activeMasterManager] master.ActiveMasterManager: Registered Active Master=post-om1.vodafone.flytxt.com,16000,1507059524710
2018-01-17 16:24:46,275 INFO [post-om1:16000.activeMasterManager] master.AssignmentManager: Joined the cluster in 119ms, failover=true
Logs on post-om1 from when post-os10 went down:
2018-01-17 18:45:35,364 ERROR [PriorityRpcServer.handler=11,queue=1,port=16000] master.MasterRpcServices: Region server post-os10.vodafone.flytxt.com,16020,1507059587301 reported a fatal error:
ABORTING region server post-os10.vodafone.flytxt.com,16020,1507059587301: IOE in log roller
Cause:
java.io.IOException: cannot get log writer
at org.apache.hadoop.hbase.wal.DefaultWALProvider.createWriter(DefaultWALProvider.java:365)
at org.apache.hadoop.hbase.regionserver.wal.FSHLog.createWriterInstance(FSHLog.java:746)
at org.apache.hadoop.hbase.regionserver.wal.FSHLog.rollWriter(FSHLog.java:711)
at org.apache.hadoop.hbase.regionserver.LogRoller.run(LogRoller.java:137)
at java.lang.Thread.run(Thread.java:744)
Caused by: java.io.FileNotFoundException: Parent directory doesn't exist: /apps/hbase/data/WALs/post-os10.vodafone.flytxt.com,16020,1507059587301
2018-01-17 18:54:21,800 ERROR [PriorityRpcServer.handler=11,queue=1,port=16000] master.MasterRpcServices: Region server post-os5.vodafone.flytxt.com,16020,1513481333569 reported a fatal error:
ABORTING region server post-os5.vodafone.flytxt.com,16020,1513481333569: IOE in log roller
Cause:
java.io.IOException: cannot get log writer

Sample region server logs:
2018-01-17 18:52:10,311 INFO [regionserver/post-os5.vodafone.flytxt.com/10.88.8.76:16020-SendThread(post-om1.vodafone.flytxt.com:2181)] zookeeper.ClientCnxn: Opening socket connection to server post-om1.vodafone.flytxt.com/10.88.8.71:2181. Will not attempt to authenticate using SASL (unknown error)
2018-01-17 18:52:10,311 INFO [regionserver/post-os5.vodafone.flytxt.com/10.88.8.76:16020-SendThread(post-om1.vodafone.flytxt.com:2181)] zookeeper.ClientCnxn: Socket connection established to post-om1.vodafone.flytxt.com/10.88.8.71:2181, initiating session
2018-01-17 18:52:10,313 INFO [regionserver/post-os5.vodafone.flytxt.com/10.88.8.76:16020-SendThread(post-om1.vodafone.flytxt.com:2181)] zookeeper.ClientCnxn: Unable to reconnect to ZooKeeper service, session 0x15ee3bbf9933b39 has expired, closing socket connection
2018-01-17 18:52:10,313 WARN [regionserver/post-os5.vodafone.flytxt.com/10.88.8.76:16020-EventThread] client.ConnectionManager$HConnectionImplementation: This client just lost it's session with ZooKeeper, closing it. It will be recreated next time someone needs it
org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired
at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:606)
at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:517)
at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:522)
at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
2018-01-17 18:52:10,313 INFO [regionserver/post-os5.vodafone.flytxt.com/10.88.8.76:16020-EventThread] client.ConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x15ee3bbf9933b39
2018-01-17 18:52:10,313 INFO [regionserver/post-os5.vodafone.flytxt.com/10.88.8.76:16020-EventThread] zookeeper.ClientCnxn: EventThread shut down
2018-01-17 18:52:10,335 INFO [main-SendThread(post-om2.vodafone.flytxt.com:2181)] zookeeper.ClientCnxn: Opening socket connection to server post-om2.vodafone.flytxt.com/10.88.8.79:2181. Will not attempt to authenticate using SASL (unknown error)
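
To line up the order of events across the master and region server logs, a small script can pull out the timestamped lines of interest. This is only a rough sketch: the marker strings are copied from the log excerpts above, while the event labels and the function name `extract_events` are made up for illustration. It only looks at lines that start with a timestamp, so stack-trace lines are skipped.

```python
import re
from datetime import datetime

# Timestamp, level and message at the start of a log line, e.g.
# "2018-01-17 16:22:29,895 ERROR [thread] ..."
LINE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3} (\w+)\s+(.*)$")

# Substrings taken from the logs in this post; the labels are hypothetical.
MARKERS = {
    "has expired": "zk_session_expired",
    "Registered Active Master": "master_failover",
    "reported a fatal error": "regionserver_abort",
}


def extract_events(lines):
    """Return (timestamp, label) pairs for matching lines, sorted by time."""
    events = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # stack-trace or continuation line without a timestamp
        when = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        message = match.group(3)
        for needle, label in MARKERS.items():
            if needle in message:
                events.append((when, label))
                break
    return sorted(events)
```

Feeding it the lines from all the log files together (for example `extract_events(open(path).read().splitlines())`, with `path` being whichever log file is at hand) gives one merged timeline of failover, session expiry and region server aborts.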
Labels: Apache HBase