<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Replace hardware on NodeManager server / yarn.nodemanager.recovery.enabled = false in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Replace-hardware-on-NodeManager-server-yarn-nodemanager/m-p/105579#M21570</link>
    <description>&lt;P&gt;Unlike replacing a disk on a DataNode, I can't find any information on replacing a disk on a NodeManager.&lt;/P&gt;&lt;P&gt;Since HDP sets yarn.nodemanager.recovery.enabled = true, my guess is that if I stopped a NodeManager while some containers were running, the jobs related to those containers would wait until the NodeManager was started again, which may be inconvenient because it would affect SLAs.&lt;/P&gt;&lt;P&gt;If this is true, is there any issue with setting yarn.nodemanager.recovery.enabled = &lt;STRONG&gt;false&lt;/STRONG&gt; permanently, so that when a NodeManager is stopped, the containers are (as I expect) created on another NodeManager?&lt;/P&gt;</description>
    <pubDate>Wed, 02 Mar 2016 16:54:44 GMT</pubDate>
    <dc:creator>hosako</dc:creator>
    <dc:date>2016-03-02T16:54:44Z</dc:date>
    <item>
      <title>Replace hardware on NodeManager server / yarn.nodemanager.recovery.enabled = false</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Replace-hardware-on-NodeManager-server-yarn-nodemanager/m-p/105579#M21570</link>
      <description>&lt;P&gt;Unlike replacing a disk on a DataNode, I can't find any information on replacing a disk on a NodeManager.&lt;/P&gt;&lt;P&gt;Since HDP sets yarn.nodemanager.recovery.enabled = true, my guess is that if I stopped a NodeManager while some containers were running, the jobs related to those containers would wait until the NodeManager was started again, which may be inconvenient because it would affect SLAs.&lt;/P&gt;&lt;P&gt;If this is true, is there any issue with setting yarn.nodemanager.recovery.enabled = &lt;STRONG&gt;false&lt;/STRONG&gt; permanently, so that when a NodeManager is stopped, the containers are (as I expect) created on another NodeManager?&lt;/P&gt;</description>
      <pubDate>Wed, 02 Mar 2016 16:54:44 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Replace-hardware-on-NodeManager-server-yarn-nodemanager/m-p/105579#M21570</guid>
      <dc:creator>hosako</dc:creator>
      <dc:date>2016-03-02T16:54:44Z</dc:date>
    </item>
    <item>
      <title>Re: Replace hardware on NodeManager server / yarn.nodemanager.recovery.enabled = false</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Replace-hardware-on-NodeManager-server-yarn-nodemanager/m-p/105580#M21571</link>
      <description>&lt;P&gt;See &lt;A href="https://issues.apache.org/jira/browse/YARN-1336" target="_blank"&gt;https://issues.apache.org/jira/browse/YARN-1336&lt;/A&gt; &lt;A rel="user" href="https://community.cloudera.com/users/206/hosako.html" nodeid="206"&gt;@Hajime&lt;/A&gt;&lt;/P&gt;&lt;P&gt;I don't recommend setting this to false. This setting helps significantly when a NodeManager fails for any reason.&lt;/P&gt;&lt;P&gt;Make sure that yarn.nodemanager.recovery.dir points to a non-temporary directory.&lt;/P&gt;</description>
      <pubDate>Wed, 02 Mar 2016 17:21:42 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Replace-hardware-on-NodeManager-server-yarn-nodemanager/m-p/105580#M21571</guid>
      <dc:creator>nsabharwal</dc:creator>
      <dc:date>2016-03-02T17:21:42Z</dc:date>
    </item>
    <item>
      <title>Re: Replace hardware on NodeManager server / yarn.nodemanager.recovery.enabled = false</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Replace-hardware-on-NodeManager-server-yarn-nodemanager/m-p/105581#M21572</link>
      <description>&lt;P&gt;A NodeManager restart does not recreate the containers; it reattaches to existing containers that are still running. That is, when a NodeManager is restarted, the server may not have been rebooted, only the NodeManager process. Instead of killing all containers and starting fresh, the NodeManager can reattach to the still-running containers, which reduces the impact on running applications.&lt;/P&gt;&lt;P&gt;This is especially good for long-running applications like Spark Streaming and for Application Masters, so SLAs shouldn't be affected. If, for example, the whole node goes down, MapReduce and Tez will still see the dead containers and application masters and recreate them as necessary. YARN recovery has no impact on that.&lt;/P&gt;&lt;P&gt;&lt;A href="http://hortonworks.com/blog/resilience-of-yarn-applications-across-nodemanager-restarts/" target="_blank"&gt;http://hortonworks.com/blog/resilience-of-yarn-applications-across-nodemanager-restarts/&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 02 Mar 2016 17:25:54 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Replace-hardware-on-NodeManager-server-yarn-nodemanager/m-p/105581#M21572</guid>
      <dc:creator>bleonhardi</dc:creator>
      <dc:date>2016-03-02T17:25:54Z</dc:date>
    </item>
    <item>
      <title>Re: Replace hardware on NodeManager server / yarn.nodemanager.recovery.enabled = false</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Replace-hardware-on-NodeManager-server-yarn-nodemanager/m-p/105582#M21573</link>
      <description>&lt;P&gt;So, for hardware replacement, I don't need to worry about recovery? Just shut down the OS?&lt;/P&gt;</description>
      <pubDate>Wed, 02 Mar 2016 17:29:12 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Replace-hardware-on-NodeManager-server-yarn-nodemanager/m-p/105582#M21573</guid>
      <dc:creator>hosako</dc:creator>
      <dc:date>2016-03-02T17:29:12Z</dc:date>
    </item>
    <item>
      <title>Re: Replace hardware on NodeManager server / yarn.nodemanager.recovery.enabled = false</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Replace-hardware-on-NodeManager-server-yarn-nodemanager/m-p/105583#M21574</link>
      <description>&lt;P&gt;If you shut down the OS, all tasks running on that node will be stopped too, so you don't need to worry about recovery. You might kill the application masters running on that node, though. As far as I know, there is not yet a graceful shutdown of a NodeManager that waits for running applications to finish (if someone knows better, let me know). YARN depends on applications to handle task or AM failures gracefully.&lt;/P&gt;&lt;P&gt;&lt;A href="https://issues.apache.org/jira/browse/YARN-914" target="_blank"&gt;https://issues.apache.org/jira/browse/YARN-914&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 02 Mar 2016 17:37:34 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Replace-hardware-on-NodeManager-server-yarn-nodemanager/m-p/105583#M21574</guid>
      <dc:creator>bleonhardi</dc:creator>
      <dc:date>2016-03-02T17:37:34Z</dc:date>
    </item>
  </channel>
</rss>

