<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: NodeManager receives SIGKILL on CDH 5.4.1 in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/NodeManager-receives-SIGKILL-on-CDH-5-4-1/m-p/43545#M34724</link>
    <description>&lt;P&gt;&lt;STRONG&gt;Update:&lt;/STRONG&gt; I was finally able to reproduce the problem on a non-production cluster where I could enable a heap dump on OOM. I found that NodeManager was holding some very large Strings containing the stdout/stderr of the applications it was running. The fix is to redirect stdout/stderr to /dev/null in our ContainerLaunchContext so the streams are not picked up by NodeManager at all.&lt;/P&gt;</description>
    <pubDate>Thu, 04 Aug 2016 15:16:31 GMT</pubDate>
    <dc:creator>therealmfigura</dc:creator>
    <dc:date>2016-08-04T15:16:31Z</dc:date>
    <item>
      <title>NodeManager receives SIGKILL on CDH 5.4.1</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/NodeManager-receives-SIGKILL-on-CDH-5-4-1/m-p/42860#M34722</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I am running a custom Hadoop/YARN application on a 20-node CDH 5.4.1 cluster. Every node runs NodeManager. Once in a while, some of the NodeManagers spontaneously restart. This shows up as an unexpected exit alert in Cloudera Manager.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Nothing appears in the NodeManager logs (/var/log/hadoop-yarn/) before the startup message.&lt;/P&gt;&lt;P&gt;&lt;EM&gt;/var/log/cloudera-scm-agent/cloudera-scm-agent.log&lt;/EM&gt; notes the unexpected exit but gives no other information.&lt;/P&gt;&lt;P&gt;&lt;EM&gt;/var/log/cloudera-scm-agent/supervisord.log&lt;/EM&gt; notes that NodeManager exited due to SIGKILL.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Is there another Cloudera (or Hadoop) component, besides the Cloudera agent, that might be sending the SIGKILL?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Usually a group of about five NodeManagers restarts at once, and then there are no restarts for hours or days. It's not always the same nodes.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks for any help!&lt;/P&gt;&lt;P&gt;Mark&lt;/P&gt;</description>
      <pubDate>Fri, 16 Sep 2022 10:30:04 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/NodeManager-receives-SIGKILL-on-CDH-5-4-1/m-p/42860#M34722</guid>
      <dc:creator>therealmfigura</dc:creator>
      <dc:date>2022-09-16T10:30:04Z</dc:date>
    </item>
    <item>
      <title>Re: NodeManager receives SIGKILL on CDH 5.4.1</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/NodeManager-receives-SIGKILL-on-CDH-5-4-1/m-p/43175#M34723</link>
      <description>&lt;P&gt;&lt;STRONG&gt;Update&lt;/STRONG&gt;: I've found that NodeManager is being killed by Cloudera's killparent.sh script after an OutOfMemoryError. I found this by modifying killparent.sh to log a message before killing NodeManager.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;We've increased the -Xmx setting for NodeManager from 1GB to 2GB and it's still happening, though less often. It's unclear why, since the JVM memory usage reported through Cloudera Manager doesn't seem to be especially close to the maximum.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I suppose the next step is to enable a heap dump on OOM, though this may be difficult on this production cluster...&lt;/P&gt;</description>
      <pubDate>Mon, 25 Jul 2016 16:20:02 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/NodeManager-receives-SIGKILL-on-CDH-5-4-1/m-p/43175#M34723</guid>
      <dc:creator>therealmfigura</dc:creator>
      <dc:date>2016-07-25T16:20:02Z</dc:date>
    </item>
    <item>
      <title>Re: NodeManager receives SIGKILL on CDH 5.4.1</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/NodeManager-receives-SIGKILL-on-CDH-5-4-1/m-p/43545#M34724</link>
      <description>&lt;P&gt;&lt;STRONG&gt;Update:&lt;/STRONG&gt; I was finally able to reproduce the problem on a non-production cluster where I could enable a heap dump on OOM. I found that NodeManager was holding some very large Strings containing the stdout/stderr of the applications it was running. The fix is to redirect stdout/stderr to /dev/null in our ContainerLaunchContext so the streams are not picked up by NodeManager at all.&lt;/P&gt;</description>
      <pubDate>Thu, 04 Aug 2016 15:16:31 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/NodeManager-receives-SIGKILL-on-CDH-5-4-1/m-p/43545#M34724</guid>
      <dc:creator>therealmfigura</dc:creator>
      <dc:date>2016-08-04T15:16:31Z</dc:date>
    </item>
  </channel>
</rss>

