<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: unable to start nodemanager in CDH 5.16.2 in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/unable-to-start-nodemanager-in-CDH-5-16-2/m-p/300154#M220061</link>
    <description>&lt;P&gt;Hello&amp;nbsp;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/22673"&gt;@Nagamalleswara&lt;/a&gt;,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thank you for sharing the detailed description of the issues your NM seems to have on a certain node.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Next time you hit this problem, can you please check whether you see multiple NM instances on that single node?&lt;/P&gt;&lt;P&gt;In some cases the CM agent might get into a bad state.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;When you hit the issue, please run the following commands and check whether they show 2 NM processes on the host.&lt;/P&gt;&lt;P&gt;SSH into the host and become root:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;# export SUPER_CONF=/var/run/cloudera-scm-agent/supervisor/supervisord.conf&lt;/P&gt;&lt;P&gt;# /usr/lib64/cmf/agent/build/env/bin/supervisorctl -c $SUPER_CONF status&lt;/P&gt;&lt;P&gt;52372-yarn-NODEMANAGER&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; RUNNING&amp;nbsp;&amp;nbsp;&amp;nbsp; pid 71771, uptime 2:21:19&lt;/P&gt;&lt;P&gt;52723-yarn-NODEMANAGER&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; FATAL&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; Exited too quickly (process log may have details)&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;If so, it means that the CM agent process was in a bad state.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;In this case please:&lt;/P&gt;&lt;P&gt;1) Proceed to kill the PID. 
Verify that the PID no longer exists and that nothing else is listening on the port.&lt;/P&gt;&lt;P&gt;2) &lt;A href="https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/cm_ag_agents.html" target="_self"&gt;Restart cloudera-scm-agent&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;service cloudera-scm-agent restart&lt;/P&gt;&lt;P&gt;3) Start the NM process from CM&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Please share your findings with us. Even if you are not hitting the issue described above, that still helps us narrow this issue down further.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Please also check that the NM heap size on this node is set to at least 1 GB, just to rule out stop-the-world GC activity rendering the NM unresponsive.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thank you,&lt;BR /&gt;Ferenc&lt;/P&gt;</description>
    <pubDate>Tue, 21 Jul 2020 15:33:09 GMT</pubDate>
    <dc:creator>Bender</dc:creator>
    <dc:date>2020-07-21T15:33:09Z</dc:date>
  </channel>
</rss>

