<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Zookeeper slow fsync followed by CancelledKeyException. in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Zookeeper-slow-fsync-followed-by-CancelledKeyException/m-p/67774#M8636</link>
    <description>&lt;P&gt;Please let us know how it was resolved. We are facing the same exception in our ZooKeeper as well.&lt;/P&gt;</description>
    <pubDate>Thu, 31 May 2018 08:44:36 GMT</pubDate>
    <dc:creator>rk1991</dc:creator>
    <dc:date>2018-05-31T08:44:36Z</dc:date>
    <item>
      <title>Zookeeper slow fsync followed by CancelledKeyException.</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Zookeeper-slow-fsync-followed-by-CancelledKeyException/m-p/32915#M8633</link>
      <description>&lt;P&gt;I am trying to enable HA for the ResourceManager as well as the NameNode. However, the masters fail over to the standby very often. There is no issue with HA as such, but every failover ends up exhausting one application attempt. I notice the following issue:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;A series of slow fsync warnings, followed (only sometimes) by a CancelledKeyException.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;2015-10-12 17:22:41,000 - WARN [SyncThread:3:FileTxnLog@334] - fsync-ing the write ahead log in SyncThread:3 took 6943ms which will adversely effect operation latency. See the ZooKeeper troubleshooting guide
2015-10-12 17:22:41,001 - INFO [ProcessThread(sid:3 cport:-1)::PrepRequestProcessor@494] - Processed session termination for sessionid: 0x1505bcdb3e3054e

2015-10-12 17:22:41,002 - INFO [ProcessThread(sid:3 cport:-1)::PrepRequestProcessor@645] - Got user-level KeeperException when processing sessionid:0x1505bcdb3e30003 type:ping cxid:0xfffffffffffffffe zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a Error Path:null Error:KeeperErrorCode = Session moved
2015-10-12 17:22:41,004 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007] - Closed socket connection for client /10.65.144.35:36030 which had sessionid 0x1505bcdb3e30003
2015-10-12 17:22:41,006 - ERROR [CommitProcessor:3:NIOServerCnxn@178] - Unexpected Exception: 
java.nio.channels.CancelledKeyException
at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:73)
at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:77)
at org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:151)
at org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1081)
at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:404)
at org.apache.zookeeper.server.quorum.Leader$ToBeAppliedRequestProcessor.processRequest(Leader.java:644)
at org.apache.zookeeper.server.quorum.CommitProcessor.run(CommitProcessor.java:74)&lt;/PRE&gt;&lt;P&gt;The fsync sometimes takes as long as 10 seconds. That could surely time out the clients, which I suppose leads to deletion of the ephemeral nodes that the masters created.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Around this time, the masters switch over. I have checked that disk space is not a concern. However, at times the await time on the ZK dataDir drive does show a surge.&lt;/P&gt;&lt;P&gt;I also confirmed that GC pauses are minimal.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Any pointers would be really appreciated.&lt;/P&gt;</description>
      <pubDate>Fri, 16 Sep 2022 09:43:45 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Zookeeper-slow-fsync-followed-by-CancelledKeyException/m-p/32915#M8633</guid>
      <dc:creator>sumit.nigam</dc:creator>
      <dc:date>2022-09-16T09:43:45Z</dc:date>
    </item>
    <item>
      <title>Re: Zookeeper slow fsync followed by CancelledKeyException.</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Zookeeper-slow-fsync-followed-by-CancelledKeyException/m-p/50467#M8634</link>
      <description>&lt;P&gt;Hi sumit.nigam,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I see that I'm a bit late to the party, but I found your thread while looking for a solution to a problem that I have as well.&lt;/P&gt;&lt;P&gt;Are you hosting the ZooKeeper servers on virtual machines or on real hardware? Is the ZooKeeper data stored on a dedicated disk?&lt;/P&gt;&lt;P&gt;Depending on the version you are running, there are guides in the ZooKeeper documentation that might help:&lt;/P&gt;&lt;P&gt;&lt;A href="https://zookeeper.apache.org/doc/trunk/zookeeperStarted.html" target="_blank"&gt;https://zookeeper.apache.org/doc/trunk/zookeeperStarted.html&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;A href="https://zookeeper.apache.org/doc/trunk/zookeeperAdmin.html#sc_maintenance" target="_blank"&gt;https://zookeeper.apache.org/doc/trunk/zookeeperAdmin.html#sc_maintenance&lt;/A&gt;&lt;/P&gt;&lt;P&gt;For example, if your ZK cluster has been running for a while, you may need to clean up some of the old logs.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;In either case, it always helps to know more details about the setup you are debugging. &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Hope it helps (someone),&lt;/P&gt;&lt;P&gt;camypaj&lt;/P&gt;</description>
      <pubDate>Mon, 06 Feb 2017 12:09:08 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Zookeeper-slow-fsync-followed-by-CancelledKeyException/m-p/50467#M8634</guid>
      <dc:creator>samurai</dc:creator>
      <dc:date>2017-02-06T12:09:08Z</dc:date>
    </item>
    <item>
      <title>Re: Zookeeper slow fsync followed by CancelledKeyException.</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Zookeeper-slow-fsync-followed-by-CancelledKeyException/m-p/50524#M8635</link>
      <description>&lt;P&gt;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/17543"&gt;@samurai&lt;/a&gt;&amp;nbsp;- Yes, there were two main issues. One was that these were VMs, and the other was that ZooKeeper was co-located with another service that shared the same disk.&lt;/P&gt;</description>
      <pubDate>Tue, 07 Feb 2017 03:33:40 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Zookeeper-slow-fsync-followed-by-CancelledKeyException/m-p/50524#M8635</guid>
      <dc:creator>sumit.nigam</dc:creator>
      <dc:date>2017-02-07T03:33:40Z</dc:date>
    </item>
    <item>
      <title>Re: Zookeeper slow fsync followed by CancelledKeyException.</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Zookeeper-slow-fsync-followed-by-CancelledKeyException/m-p/67774#M8636</link>
      <description>&lt;P&gt;Please let us know how it was resolved. We are facing the same exception in our ZooKeeper as well.&lt;/P&gt;</description>
      <pubDate>Thu, 31 May 2018 08:44:36 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Zookeeper-slow-fsync-followed-by-CancelledKeyException/m-p/67774#M8636</guid>
      <dc:creator>rk1991</dc:creator>
      <dc:date>2018-05-31T08:44:36Z</dc:date>
    </item>
  </channel>
</rss>

