<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Intermittently one of the journal nodes get out of Sync in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/Intermittently-one-of-the-journal-nodes-get-out-of-Sync/m-p/51207#M56436</link>
    <description>That is probably the source of the spike in edits being written to the JNs. You could try to address it to reduce the impact.</description>
    <pubDate>Tue, 21 Feb 2017 03:40:05 GMT</pubDate>
    <dc:creator>mbigelow</dc:creator>
    <dc:date>2017-02-21T03:40:05Z</dc:date>
    <item>
      <title>Intermittently one of the journal nodes get out of Sync</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Intermittently-one-of-the-journal-nodes-get-out-of-Sync/m-p/51110#M56428</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have 3 JNs: 2 on physical servers and the 3rd on a virtual server with 6 vcores.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Recently, from time to time, the VM gets out of sync for a few seconds. I checked the VM's resources and parameters and nothing looks out of the ordinary. What I see in the Cloudera Manager metrics is that the journal write bytes are sometimes higher than at other times.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Here is what I see:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;The active NameNode was out of sync with this JournalNode.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;===============&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;org.apache.hadoop.hdfs.qjournal.protocol.JournalOutOfSyncException: Can't write txid 1659311573 expecting nextTxId=1659311555
	at org.apache.hadoop.hdfs.qjournal.server.Journal.checkSync(Journal.java:485)
	at org.apache.hadoop.hdfs.qjournal.server.Journal.journal(Journal.java:371)
	at org.apache.hadoop.hdfs.qjournal.server.JournalNodeRpcServer.journal(JournalNodeRpcServer.java:149)
	at org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolServerSideTranslatorPB.journal(QJournalProtocolServerSideTranslatorPB.java:158)
	at org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos$QJournalProtocolService$2.callBlockingMethod(QJournalProtocolProtos.java:25421)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1707)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 16 Sep 2022 11:06:46 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Intermittently-one-of-the-journal-nodes-get-out-of-Sync/m-p/51110#M56428</guid>
      <dc:creator>Fawze</dc:creator>
      <dc:date>2022-09-16T11:06:46Z</dc:date>
    </item>
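    <!-- Editorial sketch, not part of the original post. The exception above reports
    the transaction id the NameNode tried to write and the one this JournalNode was
    expecting. A minimal way to pull both numbers out of such a log line and see how
    far the JN had fallen behind is shown here. The function name txid_gap is
    hypothetical, and the pattern assumes the message text matches the one quoted in
    the question.

```python
import re

# Pattern for the message text seen in the stack trace quoted in the question.
_OUT_OF_SYNC = re.compile(r"Can't write txid (\d+) expecting nextTxId=(\d+)")


def txid_gap(log_line):
    """Return (written, expected, gap) for an out-of-sync message, else None."""
    match = _OUT_OF_SYNC.search(log_line)
    if match is None:
        return None
    written, expected = int(match.group(1)), int(match.group(2))
    return written, expected, written - expected


line = ("org.apache.hadoop.hdfs.qjournal.protocol.JournalOutOfSyncException: "
        "Can't write txid 1659311573 expecting nextTxId=1659311555")
print(txid_gap(line))  # (1659311573, 1659311555, 18)
```

    For the line quoted in the question the gap is 18 transactions, which fits a
    short lag rather than a long outage.
    -->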
    <item>
      <title>Re: Intermittently one of the journal nodes get out of Sync</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Intermittently-one-of-the-journal-nodes-get-out-of-Sync/m-p/51127#M56429</link>
      <description>&lt;P&gt;Even though the VM looks fine, it is probably a resource constraint on the VM that is causing this issue.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The NameNode writes each edit to its own local directory and to all of the JN edits directories. It simply sounds like the VM isn't keeping up or getting the job done in time.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Examine the contents of the JN edits directory on each node and you will find that the VM does not contain all of the necessary edits. You can manually copy the edits_* files to the VM node to get it back in sync and see if it happens again. I do recommend using the same hardware for all three Master nodes that run a JN and ZK instance. Otherwise, you will often find yourself just barely maintaining the quorum needed to stay running.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;dfs.namenode.shared.edits.dir&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;dfs.journalnode.edits.dir&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Sat, 18 Feb 2017 18:08:22 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Intermittently-one-of-the-journal-nodes-get-out-of-Sync/m-p/51127#M56429</guid>
      <dc:creator>mbigelow</dc:creator>
      <dc:date>2017-02-18T18:08:22Z</dc:date>
    </item>
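    <!-- Editorial sketch, not part of the original post. The reply above suggests
    comparing the edits_* files in each JournalNode's edits directory. One rough way
    to do that is to compare directory listings and report finalized segments that
    one JN has and another lacks. This assumes finalized segments are named
    edits_<firstTxId>-<lastTxId> and in-progress ones edits_inprogress_<firstTxId>.
    The helper names and the sample listings are made up for illustration.

```python
import re

# Assumed naming: finalized segments are "edits_<firstTxId>-<lastTxId>" with
# zero-padded ids; in-progress segments do not match and are skipped.
_FINALIZED = re.compile(r"^edits_(\d+)-(\d+)$")


def finalized_ranges(filenames):
    """Return sorted (first_txid, last_txid) pairs for finalized segments."""
    ranges = []
    for name in filenames:
        match = _FINALIZED.match(name)
        if match:
            ranges.append((int(match.group(1)), int(match.group(2))))
    return sorted(ranges)


def missing_ranges(reference, candidate):
    """Segments present in the reference listing but absent from the candidate."""
    have = set(finalized_ranges(candidate))
    return [r for r in finalized_ranges(reference) if r not in have]


# Illustrative listings only; real ones would come from each JN's edits directory.
physical_jn = [
    "edits_0000000000000000001-0000000000000000100",
    "edits_0000000000000000101-0000000000000000250",
    "edits_inprogress_0000000000000000251",
]
vm_jn = [
    "edits_0000000000000000001-0000000000000000100",
    "edits_inprogress_0000000000000000101",
]
print(missing_ranges(physical_jn, vm_jn))  # [(101, 250)]
```

    In practice the listings could be collected with os.listdir on each JN host.
    A non-empty result only shows which finalized segments differ, not why.
    -->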
    <item>
      <title>Re: Intermittently one of the journal nodes get out of Sync</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Intermittently-one-of-the-journal-nodes-get-out-of-Sync/m-p/51128#M56430</link>
      <description>Indeed, it happens for a few seconds and then the VM gets back in sync. It happens from time to time, so I suspect that a job or Hive query that writes a lot of blocks and files may be causing the issue.&lt;BR /&gt;&lt;BR /&gt;Do you think I should examine this again? Should I check the contents of the files themselves? Do you think migrating the JN role from the VM to a stronger node with 12 vcores could solve the issue?</description>
      <pubDate>Sat, 18 Feb 2017 18:17:17 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Intermittently-one-of-the-journal-nodes-get-out-of-Sync/m-p/51128#M56430</guid>
      <dc:creator>Fawze</dc:creator>
      <dc:date>2017-02-18T18:17:17Z</dc:date>
    </item>
    <item>
      <title>Re: Intermittently one of the journal nodes get out of Sync</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Intermittently-one-of-the-journal-nodes-get-out-of-Sync/m-p/51129#M56431</link>
      <description>I do think that you need to move the JN to hardware that is the same as or similar to what the others run on.&lt;BR /&gt;&lt;BR /&gt;You don't need to check the contents of the files themselves. Since it is happening every few seconds, it is just lagging behind and then catching up. So if you want to run any real loads on the cluster, the JN needs to be moved to better hardware.</description>
      <pubDate>Sat, 18 Feb 2017 18:21:25 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Intermittently-one-of-the-journal-nodes-get-out-of-Sync/m-p/51129#M56431</guid>
      <dc:creator>mbigelow</dc:creator>
      <dc:date>2017-02-18T18:21:25Z</dc:date>
    </item>
    <item>
      <title>Re: Intermittently one of the journal nodes get out of Sync</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Intermittently-one-of-the-journal-nodes-get-out-of-Sync/m-p/51131#M56432</link>
      <description>&lt;P&gt;Is it common to run a JN on a DataNode/NodeManager server?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;In my cluster, the 2 NNs are physical; the CM server and the application server that hosts MySQL and Oozie are VMs; all the DataNodes are physical.&lt;/P&gt;</description>
      <pubDate>Sat, 18 Feb 2017 18:25:59 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Intermittently-one-of-the-journal-nodes-get-out-of-Sync/m-p/51131#M56432</guid>
      <dc:creator>Fawze</dc:creator>
      <dc:date>2017-02-18T18:25:59Z</dc:date>
    </item>
    <item>
      <title>Re: Intermittently one of the journal nodes get out of Sync</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Intermittently-one-of-the-journal-nodes-get-out-of-Sync/m-p/51132#M56433</link>
      <description>No, typically worker nodes run just the processes that do the work: DataNode, Impala daemon, NodeManager.&lt;BR /&gt;&lt;BR /&gt;In theory you could, with the edits on the OS disk (not on any HDFS disks), but you will eventually run into contention between the OS, the logs, and the edits. That may be tolerable if you have a small cluster.&lt;BR /&gt;&lt;BR /&gt;My minimum, for a production cluster and/or HA, is three large, physical servers for the Master nodes.&lt;BR /&gt;&lt;BR /&gt;The DBs (although I prefer to have the HMS DB on the Master nodes as well), gateway roles, and CM can all be on VMs.&lt;BR /&gt;&lt;BR /&gt;Where is your third ZK instance? That one will also have IO contention issues on a VM or on a DataNode.</description>
      <pubDate>Sat, 18 Feb 2017 18:33:17 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Intermittently-one-of-the-journal-nodes-get-out-of-Sync/m-p/51132#M56433</guid>
      <dc:creator>mbigelow</dc:creator>
      <dc:date>2017-02-18T18:33:17Z</dc:date>
    </item>
    <item>
      <title>Re: Intermittently one of the journal nodes get out of Sync</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Intermittently-one-of-the-journal-nodes-get-out-of-Sync/m-p/51134#M56434</link>
      <description>&lt;P&gt;My 3rd ZK was on the same VM, but after I ran into this issue I moved the ZK to another OpenStack server, moved the Spark History Server to one of the NNs to reduce the load on the VM, and increased the VM to 6 vcores. I still have the same issue.&lt;/P&gt;</description>
      <pubDate>Sat, 18 Feb 2017 18:42:46 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Intermittently-one-of-the-journal-nodes-get-out-of-Sync/m-p/51134#M56434</guid>
      <dc:creator>Fawze</dc:creator>
      <dc:date>2017-02-18T18:42:46Z</dc:date>
    </item>
    <item>
      <title>Re: Intermittently one of the journal nodes get out of Sync</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Intermittently-one-of-the-journal-nodes-get-out-of-Sync/m-p/51204#M56435</link>
      <description>&lt;P&gt;The interesting thing I noticed is that when this happened, some jobs that run once a day were writing a relatively large amount of data to HDFS at the same time, with a large number of reducers (between 400 and 1100). This makes me suspect that the blocks written by these jobs are what cause the VM to lag. I am trying to find a way to prove this.&lt;/P&gt;</description>
      <pubDate>Tue, 21 Feb 2017 02:46:11 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Intermittently-one-of-the-journal-nodes-get-out-of-Sync/m-p/51204#M56435</guid>
      <dc:creator>Fawze</dc:creator>
      <dc:date>2017-02-21T02:46:11Z</dc:date>
    </item>
    <item>
      <title>Re: Intermittently one of the journal nodes get out of Sync</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Intermittently-one-of-the-journal-nodes-get-out-of-Sync/m-p/51207#M56436</link>
      <description>That is probably the source of the spike in edits being written to the JNs. You could try to address it to reduce the impact.</description>
      <pubDate>Tue, 21 Feb 2017 03:40:05 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Intermittently-one-of-the-journal-nodes-get-out-of-Sync/m-p/51207#M56436</guid>
      <dc:creator>mbigelow</dc:creator>
      <dc:date>2017-02-21T03:40:05Z</dc:date>
    </item>
    <item>
      <title>Re: Intermittently one of the journal nodes get out of Sync</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Intermittently-one-of-the-journal-nodes-get-out-of-Sync/m-p/51208#M56437</link>
      <description>&lt;P&gt;Do you think looking at the edit log size when this occurs would be a good indication?&lt;/P&gt;</description>
      <pubDate>Tue, 21 Feb 2017 03:42:34 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Intermittently-one-of-the-journal-nodes-get-out-of-Sync/m-p/51208#M56437</guid>
      <dc:creator>Fawze</dc:creator>
      <dc:date>2017-02-21T03:42:34Z</dc:date>
    </item>
    <item>
      <title>Re: Intermittently one of the journal nodes get out of Sync</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Intermittently-one-of-the-journal-nodes-get-out-of-Sync/m-p/51211#M56438</link>
      <description>If I recall correctly, an edit log is just filled up to a certain size before the writer moves on to the next one.&lt;BR /&gt;&lt;BR /&gt;In CM, the NameNode metrics for Transaction, Edit Log Syncs, and Average Edit Log Sync Time would be better indicators.&lt;BR /&gt;&lt;BR /&gt;I'm not sure if these are exposed by default.</description>
      <pubDate>Tue, 21 Feb 2017 07:08:04 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Intermittently-one-of-the-journal-nodes-get-out-of-Sync/m-p/51211#M56438</guid>
      <dc:creator>mbigelow</dc:creator>
      <dc:date>2017-02-21T07:08:04Z</dc:date>
    </item>
    <item>
      <title>Re: Intermittently one of the journal nodes get out of Sync</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Intermittently-one-of-the-journal-nodes-get-out-of-Sync/m-p/51807#M56439</link>
      <description>&lt;P&gt;When I checked the jobs and queries that ran just before the alert on the JN, I found one Hive query that runs over 6 months of data and recreates the Hive table from scratch, which generated a large share of the edit logs. I contacted the query owner and he reduced the query's window from 6 months to 2 months, which solved the issue for us.&lt;/P&gt;</description>
      <pubDate>Sun, 05 Mar 2017 22:26:39 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Intermittently-one-of-the-journal-nodes-get-out-of-Sync/m-p/51807#M56439</guid>
      <dc:creator>Fawze</dc:creator>
      <dc:date>2017-03-05T22:26:39Z</dc:date>
    </item>
  </channel>
</rss>

