Contributor
Posts: 35
Registered: ‎07-27-2015
Accepted Solution
Zookeeper slow fsync followed by CancelledKeyException.

I am trying to enable HA for Resource Mgr as well NameNode. However, very often the masters failover to standby. There is no issue with HA as such, but every failover ends up exhausting one application attempt. I notice following issues:

 

A series of slow fsync followed (sometimes only) by CancelledKeyException.

 

 

2015-10-12 17:22:41,000 - WARN [SyncThread:3:FileTxnLog@334] - fsync-ing the write ahead log in SyncThread:3 took 6943ms which will adversely effect operation latency. See the ZooKeeper troubleshooting guide
2015-10-12 17:22:41,001 - INFO [ProcessThread(sid:3 cport:-1)::PrepRequestProcessor@494] - Processed session termination for sessionid: 0x1505bcdb3e3054e

2015-10-12 17:22:41,002 - INFO [ProcessThread(sid:3 cport:-1)::PrepRequestProcessor@645] - Got user-level KeeperException when processing sessionid:0x1505bcdb3e30003 type:ping cxid:0xfffffffffffffffe zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a Error Path:null Error:KeeperErrorCode = Session moved
2015-10-12 17:22:41,004 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007] - Closed socket connection for client /10.65.144.35:36030 which had sessionid 0x1505bcdb3e30003
2015-10-12 17:22:41,006 - ERROR [CommitProcessor:3:NIOServerCnxn@178] - Unexpected Exception: 
java.nio.channels.CancelledKeyException
at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:73)
at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:77)
at org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:151)
at org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1081)
at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:404)
at org.apache.zookeeper.server.quorum.Leader$ToBeAppliedRequestProcessor.processRequest(Leader.java:644)
at org.apache.zookeeper.server.quorum.CommitProcessor.run(CommitProcessor.java:74)

The time taken is some time as high as 10sec. This could surely timeout the clients, I suppose leading to deletion of ephemeral nodes that masters created.

 

Around this time, the masters switch over. I have seen that disk space is not a concern. However, at times the await time to ZK dataDir drive does show a surge.

I also confirmed that GC pauses are minimal.

 

Any pointers would be really appreciated. 

Who Me Too'd this topic