<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: HDFS Checkpoint problem in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/HDFS-Checkpoint-problem/m-p/18200#M2800</link>
    <description>Do your NameNodes and JobTracker run on the same host?&lt;BR /&gt;&lt;BR /&gt;</description>
    <pubDate>Thu, 04 Sep 2014 12:48:35 GMT</pubDate>
    <dc:creator>GautamG</dc:creator>
    <dc:date>2014-09-04T12:48:35Z</dc:date>
    <item>
      <title>HDFS Checkpoint problem</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/HDFS-Checkpoint-problem/m-p/18128#M2797</link>
      <description>&lt;P&gt;Hi All,&lt;/P&gt;&lt;P&gt;Our MapReduce jobs suddenly pause while an HDFS checkpoint is running and return to normal once the checkpoint is done. Is there any relationship between standby NameNode activity and the active JobTracker that would affect MapReduce jobs? We have been searching the log files for clues but have found nothing useful so far.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;For additional information, our cluster runs CDH4U3 with HA NameNode and JobTracker.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;regards,&lt;BR /&gt;-i9um0-&lt;/P&gt;</description>
      <pubDate>Fri, 16 Sep 2022 09:06:42 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/HDFS-Checkpoint-problem/m-p/18128#M2797</guid>
      <dc:creator>i9um0</dc:creator>
      <dc:date>2022-09-16T09:06:42Z</dc:date>
    </item>
    <item>
      <title>Re: HDFS Checkpoint problem</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/HDFS-Checkpoint-problem/m-p/18194#M2798</link>
      <description>How long does this pause normally last? If you are able to, capture 3-5 jstack dumps of the JobTracker spaced a few seconds apart and upload them here (pastebin or gist).&lt;BR /&gt;&lt;BR /&gt;</description>
      <pubDate>Thu, 04 Sep 2014 11:43:35 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/HDFS-Checkpoint-problem/m-p/18194#M2798</guid>
      <dc:creator>GautamG</dc:creator>
      <dc:date>2014-09-04T11:43:35Z</dc:date>
    </item>
    <item>
      <title>Re: HDFS Checkpoint problem</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/HDFS-Checkpoint-problem/m-p/18198#M2799</link>
      <description>&lt;P&gt;Hi Gautam,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Roughly 10-15 minutes. We recently changed&amp;nbsp;&lt;SPAN&gt;dfs.image.compress from false to true because our FSImage has been growing quickly. Could that affect MapReduce processing? Our team will run jstack and report back.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;regards,&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;i9um0&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 04 Sep 2014 12:33:02 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/HDFS-Checkpoint-problem/m-p/18198#M2799</guid>
      <dc:creator>i9um0</dc:creator>
      <dc:date>2014-09-04T12:33:02Z</dc:date>
    </item>
    <item>
      <title>Re: HDFS Checkpoint problem</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/HDFS-Checkpoint-problem/m-p/18200#M2800</link>
      <description>Do your NameNodes and JobTracker run on the same host?&lt;BR /&gt;&lt;BR /&gt;</description>
      <pubDate>Thu, 04 Sep 2014 12:48:35 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/HDFS-Checkpoint-problem/m-p/18200#M2800</guid>
      <dc:creator>GautamG</dc:creator>
      <dc:date>2014-09-04T12:48:35Z</dc:date>
    </item>
    <item>
      <title>Re: HDFS Checkpoint problem</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/HDFS-Checkpoint-problem/m-p/18282#M2801</link>
      <description>&lt;P&gt;Hi Gautam,&lt;/P&gt;&lt;P&gt;We run those services on different hosts, but the JobTracker is colocated with a JournalNode on the same host. Links to our jstack output on pastebin follow:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;-activeNN-jstack-1&lt;BR /&gt;&lt;A target="_blank" href="http://pastebin.com/zpjFwC4z"&gt;http://pastebin.com/zpjFwC4z&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;-activeNN-jstack-2&lt;BR /&gt;&lt;A target="_blank" href="http://pastebin.com/WGSeep06"&gt;http://pastebin.com/WGSeep06&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;-activeNN-jstack-3&lt;BR /&gt;&lt;A target="_blank" href="http://pastebin.com/0Nuj5zpW"&gt;http://pastebin.com/0Nuj5zpW&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;-activeNN-jstack-4&lt;BR /&gt;&lt;A target="_blank" href="http://pastebin.com/xGj8ZpJa"&gt;http://pastebin.com/xGj8ZpJa&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;-activeNN-jstack-5&lt;BR /&gt;&lt;A target="_blank" href="http://pastebin.com/jCk8BVsS"&gt;http://pastebin.com/jCk8BVsS&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;-standbyNN-jstack-1a&lt;BR /&gt;&lt;A target="_blank" href="http://pastebin.com/Sx4iZDMD"&gt;http://pastebin.com/Sx4iZDMD&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;-standbyNN-jstack-1b&lt;BR /&gt;&lt;A target="_blank" href="http://pastebin.com/WLa4E97N"&gt;http://pastebin.com/WLa4E97N&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;-standbyNN-jstack-2a&lt;BR /&gt;&lt;A target="_blank" href="http://pastebin.com/KYnBfntZ"&gt;http://pastebin.com/KYnBfntZ&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;-standbyNN-jstack-2b&lt;BR /&gt;&lt;A target="_blank" href="http://pastebin.com/JMxatCzU"&gt;http://pastebin.com/JMxatCzU&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;-standbyNN-jstack-3a&lt;BR /&gt;&lt;A target="_blank" href="http://pastebin.com/vb4N3AX2"&gt;http://pastebin.com/vb4N3AX2&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;-standbyNN-jstack-3b&lt;BR /&gt;&lt;A target="_blank" href="http://pastebin.com/UFTkjMGs"&gt;http://pastebin.com/UFTkjMGs&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;-standbyNN-jstack-4a&lt;BR /&gt;&lt;A target="_blank" href="http://pastebin.com/xMsPKAfD"&gt;http://pastebin.com/xMsPKAfD&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;-standbyNN-jstack-4b&lt;BR /&gt;&lt;A target="_blank" href="http://pastebin.com/7ReiGmYa"&gt;http://pastebin.com/7ReiGmYa&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;-standbyNN-jstack-5a&lt;BR /&gt;&lt;A target="_blank" href="http://pastebin.com/SMmyvbkQ"&gt;http://pastebin.com/SMmyvbkQ&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;-standbyNN-jstack-5b&lt;BR /&gt;&lt;A target="_blank" href="http://pastebin.com/cS7Q2p2T"&gt;http://pastebin.com/cS7Q2p2T&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;-activeJT-jstack-1a&lt;BR /&gt;&lt;A target="_blank" href="http://pastebin.com/qVYQGh7x"&gt;http://pastebin.com/qVYQGh7x&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;-activeJT-jstack-1b&lt;BR /&gt;&lt;A target="_blank" href="http://pastebin.com/ifr2EfYW"&gt;http://pastebin.com/ifr2EfYW&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;-activeJT-jstack-2a&lt;BR /&gt;&lt;A target="_blank" href="http://pastebin.com/Jbj6YGda"&gt;http://pastebin.com/Jbj6YGda&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;-activeJT-jstack-2b&lt;BR /&gt;&lt;A target="_blank" href="http://pastebin.com/2bqKG8pE"&gt;http://pastebin.com/2bqKG8pE&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;-activeJT-jstack-3a&lt;BR /&gt;&lt;A target="_blank" href="http://pastebin.com/FLznnuuj"&gt;http://pastebin.com/FLznnuuj&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;-activeJT-jstack-3b&lt;BR /&gt;&lt;A target="_blank" href="http://pastebin.com/5km2MbMC"&gt;http://pastebin.com/5km2MbMC&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;For your information, I also replied by email in addition to posting this message.&lt;/P&gt;&lt;P&gt;-i9um0-&lt;/P&gt;</description>
      <pubDate>Fri, 05 Sep 2014 13:01:35 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/HDFS-Checkpoint-problem/m-p/18282#M2801</guid>
      <dc:creator>i9um0</dc:creator>
      <dc:date>2014-09-05T13:01:35Z</dc:date>
    </item>
    <item>
      <title>Re: HDFS Checkpoint problem</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/HDFS-Checkpoint-problem/m-p/18390#M2802</link>
      <description>&lt;P&gt;&lt;SPAN style="font-family: helvetica;"&gt;We see a lot of these in the JobTracker jstack, so the NameNode is responding.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: andale mono,times;"&gt;"DataStreamer for file /tmp/hadoop-hadoop-user/7418759843_pipe_1371547789813_7CC40A5EC84074F51068D326FE4B44CD/_logs/history/job_201409040312_85799_1409897033005_hadoop-user_%5B3529B6C5248F26FE0B927AADBA7BDA41%2F7E4BD3F9FCBCBE4B block BP-2096330913-10.250.195.101-1373872395153:blk_468657822786954548_993063000" daemon prio=10 tid=0x00007f1f2a96f000 nid=0x7b56 in Object.wait() [0x00007f1ebc9e7000]&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN style="font-family: andale mono,times;"&gt;&amp;nbsp;&amp;nbsp; java.lang.Thread.State: TIMED_WAITING (on object monitor)&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN style="font-family: andale mono,times;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; at java.lang.Object.wait(Native Method)&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN style="font-family: andale mono,times;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:464)&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN style="font-family: andale mono,times;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; - locked &amp;lt;0x0000000625121b00&amp;gt; (a java.util.LinkedList)&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN style="font-family: andale mono,times;"&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: helvetica;"&gt;Have you noticed a large spike in the number of blocks, and have you tuned your NN heap to deal with this rise? Did the JT pauses only begin when you turned on compression of the fsimage?&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 08 Sep 2014 12:11:07 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/HDFS-Checkpoint-problem/m-p/18390#M2802</guid>
      <dc:creator>GautamG</dc:creator>
      <dc:date>2014-09-08T12:11:07Z</dc:date>
    </item>
    <item>
      <title>Re: HDFS Checkpoint problem</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/HDFS-Checkpoint-problem/m-p/18406#M2803</link>
      <description>&lt;P&gt;Hi Gautam,&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;Thanks for the quick response. As I mentioned before, we have a problem with our HDFS block count: we currently have about 58 million blocks for 500 TB of used storage, which is far from ideal. The NN host has about 64 GB of memory, and we set the NN heap to 32 GB. At the same time as we changed dfs.image.compress, we also raised dfs.namenode.checkpoint.txns from the default 44K to 1000K, because we thought that frequent checkpoints were what led CM to report bad NameNode health by email.&lt;BR /&gt;&lt;BR /&gt;As for your question of whether the JT pause only began when we turned on compression of the fsimage: we are not sure. The MapReduce pauses were never as long as they are now (up to 10 minutes), so if they happened before, we did not notice them.&lt;BR /&gt;&lt;BR /&gt;Is increasing the NN heap the only option, or are there other Hadoop parameters we can tune to reduce the load on HDFS and keep MapReduce running normally during checkpoints?&lt;BR /&gt;&lt;BR /&gt;regards,&lt;BR /&gt;i9um0&lt;/P&gt;&lt;DIV class="moz-signature"&gt;&amp;nbsp;&lt;/DIV&gt;</description>
      <pubDate>Mon, 08 Sep 2014 13:21:24 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/HDFS-Checkpoint-problem/m-p/18406#M2803</guid>
      <dc:creator>i9um0</dc:creator>
      <dc:date>2014-09-08T13:21:24Z</dc:date>
    </item>
  </channel>
</rss>

