<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: How to identify stale datanode? in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-identify-stale-datanode/m-p/96341#M9732</link>
    <description>&lt;P&gt;Thanks Alex. Very good explanation. &lt;span class="lia-unicode-emoji" title=":disappointed_face:"&gt;😞&lt;/span&gt;   I learned this the hard way last night, by bringing the network down (ifdown eth1) on one of the VM nodes while its datanode was up, then refreshing the Namenode UI -&amp;gt; Datanodes tab. &lt;A rel="user" href="https://community.cloudera.com/users/63/amiller.html" nodeid="63"&gt;@Alex Miller&lt;/A&gt;&lt;/P&gt;</description>
    <pubDate>Sat, 31 Oct 2015 00:31:36 GMT</pubDate>
    <dc:creator>ayusuf</dc:creator>
    <dc:date>2015-10-31T00:31:36Z</dc:date>
    <item>
      <title>How to identify stale datanode?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-identify-stale-datanode/m-p/96336#M9727</link>
      <description>&lt;P&gt;Datanode Health Summary in Ambari Alerts reported 1 stale node. How can I identify which datanode is in the stale state?&lt;/P&gt;</description>
      <pubDate>Fri, 30 Oct 2015 07:38:22 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-identify-stale-datanode/m-p/96336#M9727</guid>
      <dc:creator>ayusuf</dc:creator>
      <dc:date>2015-10-30T07:38:22Z</dc:date>
    </item>
    <item>
      <title>Re: How to identify stale datanode?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-identify-stale-datanode/m-p/96337#M9728</link>
      <description>&lt;P&gt;The namenode logs should probably say which one it is.&lt;/P&gt;</description>
      <pubDate>Fri, 30 Oct 2015 07:38:49 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-identify-stale-datanode/m-p/96337#M9728</guid>
      <dc:creator>ssingla</dc:creator>
      <dc:date>2015-10-30T07:38:49Z</dc:date>
    </item>
    <item>
      <title>Re: How to identify stale datanode?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-identify-stale-datanode/m-p/96338#M9729</link>
      <description>&lt;P&gt;@ayusuf@hortonworks.com&lt;/P&gt;&lt;P&gt;This is a good explanation. The namenode knows which DN is stale.&lt;/P&gt;&lt;PRE&gt;dfs.namenode.stale.datanode.interval

Default time interval for marking a datanode as "stale", i.e., if the namenode has not received heartbeat msg from a datanode for more than this time interval, the datanode will be marked and treated as "stale" by default. The stale interval cannot be too small since otherwise this may cause too frequent change of stale states. We thus set a minimum stale interval value (the default value is 3 times of heartbeat interval) and guarantee that the stale interval cannot be less than the minimum value. A stale data node is avoided during lease/block recovery. It can be conditionally avoided for reads (see dfs.namenode.avoid.read.stale.datanode) and for writes (see dfs.namenode.avoid.write.stale.datanode).&lt;/PRE&gt;</description>
      <pubDate>Fri, 30 Oct 2015 08:00:12 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-identify-stale-datanode/m-p/96338#M9729</guid>
      <dc:creator>nsabharwal</dc:creator>
      <dc:date>2015-10-30T08:00:12Z</dc:date>
    </item>
    <item>
      <title>Re: How to identify stale datanode?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-identify-stale-datanode/m-p/96339#M9730</link>
      <description>&lt;P&gt;A datanode is considered stale when:&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;dfs.namenode.stale.datanode.interval &amp;lt; last contact &amp;lt; (2 * dfs.namenode.heartbeat.recheck-interval + 10 * dfs.heartbeat.interval)&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;In the &lt;A href="https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html#Web_Interface"&gt;NameNode UI&lt;/A&gt;&lt;STRONG&gt; Datanodes tab&lt;/STRONG&gt;, a stale datanode stands out by having a larger &lt;STRONG&gt;Last contact&lt;/STRONG&gt; value than the other live datanodes (also available in JMX output). When a datanode is stale, it is given the lowest priority for reads and writes, subject to the two avoid.*.stale.datanode properties listed below.&lt;/P&gt;&lt;P&gt;Using default values, the namenode considers a datanode stale when its heartbeat has been absent for 30 seconds. After &lt;EM&gt;another&lt;/EM&gt; 10 minutes without a heartbeat (2 * 5 minutes + 10 * 3 seconds = 10.5 minutes total), the datanode is considered dead.&lt;/P&gt;&lt;P&gt;Relevant &lt;A href="http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml"&gt;properties&lt;/A&gt; include:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;dfs.heartbeat.interval - default: 3 seconds
&lt;/LI&gt;&lt;LI&gt;dfs.namenode.stale.datanode.interval - default: 30 seconds
&lt;/LI&gt;&lt;LI&gt;dfs.namenode.heartbeat.recheck-interval - default: 5 minutes&lt;/LI&gt;&lt;LI&gt;dfs.namenode.avoid.read.stale.datanode - default: false in Apache Hadoop's hdfs-default.xml (some distributions, such as HDP, ship it as true)
&lt;/LI&gt;&lt;LI&gt;dfs.namenode.avoid.write.stale.datanode - default: false in Apache Hadoop's hdfs-default.xml (some distributions, such as HDP, ship it as true)&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;This feature was introduced by &lt;A href="https://issues.apache.org/jira/browse/HDFS-3703"&gt;HDFS-3703&lt;/A&gt;.&lt;/P&gt;</description>
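      <!--
      Editorial sketch of the thresholds described in the answer above, assuming the stock default values it lists. The names HeartbeatConfig, classify and find_stale are hypothetical and exist only for illustration; they are not NameNode APIs. The host-to-seconds mapping is made-up sample input standing in for the Last contact column of the Datanodes tab.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class HeartbeatConfig:
    """Illustrative holder for the three timing properties, in seconds."""
    heartbeat_interval_s: float = 3.0    # dfs.heartbeat.interval
    stale_interval_s: float = 30.0       # dfs.namenode.stale.datanode.interval
    recheck_interval_s: float = 300.0    # dfs.namenode.heartbeat.recheck-interval

    @property
    def dead_interval_s(self) -> float:
        # 2 * recheck interval + 10 * heartbeat interval (10.5 minutes by default)
        return 2 * self.recheck_interval_s + 10 * self.heartbeat_interval_s


def classify(last_contact_s: float, cfg: HeartbeatConfig = HeartbeatConfig()) -> str:
    """Return 'live', 'stale' or 'dead' for a given time since the last heartbeat."""
    if last_contact_s > cfg.dead_interval_s:
        return "dead"
    if last_contact_s > cfg.stale_interval_s:
        return "stale"
    return "live"


def find_stale(last_contact_by_host: Dict[str, float],
               cfg: HeartbeatConfig = HeartbeatConfig()) -> List[str]:
    """Return the hosts (sorted by name) whose last contact falls in the stale window."""
    return sorted(
        host for host, seconds in last_contact_by_host.items()
        if classify(seconds, cfg) == "stale"
    )


if __name__ == "__main__":
    # Hypothetical sample values, in seconds since the last heartbeat.
    sample = {"dn1": 2, "dn2": 95, "dn3": 700}
    print(find_stale(sample))
```

      With the default values, 45 seconds since the last heartbeat falls in the stale window, and anything beyond 630 seconds is treated as dead.
      -->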
      <pubDate>Fri, 30 Oct 2015 11:13:18 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-identify-stale-datanode/m-p/96339#M9730</guid>
      <dc:creator>amiller</dc:creator>
      <dc:date>2015-10-30T11:13:18Z</dc:date>
    </item>
    <item>
      <title>Re: How to identify stale datanode?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-identify-stale-datanode/m-p/96340#M9731</link>
      <description>&lt;P&gt;Nicely explained! Thanks &lt;A rel="user" href="https://community.cloudera.com/users/63/amiller.html" nodeid="63"&gt;@Alex Miller&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 30 Oct 2015 16:43:04 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-identify-stale-datanode/m-p/96340#M9731</guid>
      <dc:creator>nsabharwal</dc:creator>
      <dc:date>2015-10-30T16:43:04Z</dc:date>
    </item>
    <item>
      <title>Re: How to identify stale datanode?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-identify-stale-datanode/m-p/96341#M9732</link>
      <description>&lt;P&gt;Thanks Alex. Very good explanation. &lt;span class="lia-unicode-emoji" title=":disappointed_face:"&gt;😞&lt;/span&gt;   I learned this the hard way last night, by bringing the network down (ifdown eth1) on one of the VM nodes while its datanode was up, then refreshing the Namenode UI -&amp;gt; Datanodes tab. &lt;A rel="user" href="https://community.cloudera.com/users/63/amiller.html" nodeid="63"&gt;@Alex Miller&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Sat, 31 Oct 2015 00:31:36 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-identify-stale-datanode/m-p/96341#M9732</guid>
      <dc:creator>ayusuf</dc:creator>
      <dc:date>2015-10-31T00:31:36Z</dc:date>
    </item>
    <item>
      <title>Re: How to identify stale datanode?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-identify-stale-datanode/m-p/96342#M9733</link>
      <description>&lt;P&gt;Thanks, hopefully it will save someone the hassle in the future.&lt;/P&gt;&lt;P&gt;In the future, please leave this as a comment rather than a separate answer.&lt;/P&gt;</description>
      <pubDate>Sat, 31 Oct 2015 01:22:54 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-identify-stale-datanode/m-p/96342#M9733</guid>
      <dc:creator>amiller</dc:creator>
      <dc:date>2015-10-31T01:22:54Z</dc:date>
    </item>
    <item>
      <title>Re: How to identify stale datanode?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-identify-stale-datanode/m-p/96343#M9734</link>
      <description>&lt;P&gt;I agree. Sorry, I'm using AH for the first time and accidentally clicked reply instead of comment. &lt;span class="lia-unicode-emoji" title=":disappointed_face:"&gt;😞&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Sat, 31 Oct 2015 01:28:35 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-identify-stale-datanode/m-p/96343#M9734</guid>
      <dc:creator>ayusuf</dc:creator>
      <dc:date>2015-10-31T01:28:35Z</dc:date>
    </item>
    <item>
      <title>Re: How to identify stale datanode?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-identify-stale-datanode/m-p/96344#M9735</link>
      <description>&lt;P&gt;No worries, we're all learning it as we go &lt;/P&gt;</description>
      <pubDate>Sun, 01 Nov 2015 07:54:30 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-identify-stale-datanode/m-p/96344#M9735</guid>
      <dc:creator>amiller</dc:creator>
      <dc:date>2015-11-01T07:54:30Z</dc:date>
    </item>
  </channel>
</rss>

