<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: ambari-agent cant start in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/ambari-agent-cant-start/m-p/181532#M143737</link>
    <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/26229/uribarih.html" nodeid="26229"&gt;@uri ben-ari&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Yes, we can kill it if other users are not using HiveServer2 (just to be sure that they are not running any jobs).&lt;/P&gt;&lt;PRE&gt;# cat /var/run/hive/hive-server.pid
# ps -ef | grep `cat /var/run/hive/hive-server.pid`
# netstat -tnlpa | grep `cat /var/run/hive/hive-server.pid`
# kill -9 `cat /var/run/hive/hive-server.pid`&lt;/PRE&gt;&lt;P&gt;.&lt;/P&gt;&lt;P&gt;The commands before the kill, such as cat &amp;amp; ps, are there to confirm that we are killing the correct process.&lt;/P&gt;</description>
    <pubDate>Sun, 05 Nov 2017 18:54:45 GMT</pubDate>
    <dc:creator>jsensharma</dc:creator>
    <dc:date>2017-11-05T18:54:45Z</dc:date>
    <item>
      <title>ambari-agent cant start</title>
      <link>https://community.cloudera.com/t5/Support-Questions/ambari-agent-cant-start/m-p/181524#M143729</link>
      <description>&lt;P&gt;For some unclear reason, the ambari agent fails when we start it on the master machine.&lt;/P&gt;&lt;P&gt;From the log we can see:&lt;/P&gt;&lt;P&gt;ERROR 2017-10-02 11:58:42,597 script_alert.py:123 - [Alert][hive_server_process] Failed with result CRITICAL: ['Connection failed on host machine-master01.pop.com:10000 (Traceback (most recent call last):\n  File "/var/lib/ambari-agent/cache/common-services/HIVE/0.12.0.2.0/package/alerts/alert_hive_thrift_port.py", line 211, in execute\n 
ldap_password=ldap_password)\n  File "/usr/lib/python2.6/site-packages/resource_management/libraries/functions/hive_check.py", line 79, in check_thrift_port_sasl\n    timeout=check_command_timeout)\n  File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 155, in __init__\n    self.env.run()\n  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 160,
in run\n &lt;/P&gt;&lt;P&gt;What causes this problem?&lt;/P&gt;</description>
      <pubDate>Sun, 05 Nov 2017 16:21:19 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/ambari-agent-cant-start/m-p/181524#M143729</guid>
      <dc:creator>mike_bronson7</dc:creator>
      <dc:date>2017-11-05T16:21:19Z</dc:date>
    </item>
    <item>
      <title>Re: ambari-agent cant start</title>
      <link>https://community.cloudera.com/t5/Support-Questions/ambari-agent-cant-start/m-p/181525#M143730</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/26229/uribarih.html" nodeid="26229" target="_blank"&gt;@uri ben-ari&lt;/A&gt;&lt;/P&gt;&lt;P&gt;It looks like the Hive Server process is not running, or due to some network issue the HiveServer2 host &amp;amp; ports are not accessible from that agent machine, and hence we see this alert:&lt;/P&gt;&lt;PRE&gt;[Alert][hive_server_process] Failed with result CRITICAL: ['Connection failed on host machine-master01.pop.com:10000 &lt;/PRE&gt;&lt;P&gt;.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;The Alert scheduler will keep triggering the alert at the default specified interval. So we can try the following:&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;1. Check whether the hiveserver2 process is running and listening on port 10000. On the HiveServer2 host, please run the following commands to see if port 10000 is listening and the hostname is correct.&lt;/P&gt;&lt;PRE&gt;# netstat -tnlpa | grep 10000
# service iptables stop
# hostname -f&lt;/PRE&gt;&lt;P&gt;2. Also, since we see the "ldap_password=ldap_password" string in the error stack trace, the failure might also be due to an issue with LDAP authentication. So checking the hiveserver2 log will also be helpful.&lt;/P&gt;&lt;P&gt;Also please kill the HiveServer2 process (if possible) and then try restarting it to see if that fixes the issue.&lt;/P&gt;&lt;P&gt;.&lt;/P&gt;&lt;P&gt;3. From the agent machine, please check whether that host &amp;amp; port are accessible:&lt;/P&gt;&lt;PRE&gt;# nc -v machine-master01.pop.com 10000
(OR)
# telnet machine-master01.pop.com 10000&lt;/PRE&gt;&lt;P&gt;.&lt;/P&gt;&lt;P&gt;From Ambari Side we can try disabling the "HiveServer2 Process" alert temporarily to avoid seeing this alert.&lt;/P&gt;&lt;PRE&gt;Ambari UI --&amp;gt; "Alerts" (Tab) --&amp;gt; Search for "HiveServer2 Process" alert --&amp;gt; click on "Enabled" toggle button&lt;/PRE&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="43432-hiveserver2-process-disable-alert.png" style="width: 1225px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/19215iC07A957A92A0B24E/image-size/medium?v=v2&amp;amp;px=400" role="button" title="43432-hiveserver2-process-disable-alert.png" alt="43432-hiveserver2-process-disable-alert.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;.&lt;/P&gt;&lt;P&gt;Then after restarting the ambari agent check the ambari-agent log again.&lt;/P&gt;&lt;P&gt;.&lt;/P&gt;</description>
      <pubDate>Sun, 18 Aug 2019 09:00:33 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/ambari-agent-cant-start/m-p/181525#M143730</guid>
      <dc:creator>jsensharma</dc:creator>
      <dc:date>2019-08-18T09:00:33Z</dc:date>
    </item>
    <item>
      <title>Re: ambari-agent cant start</title>
      <link>https://community.cloudera.com/t5/Support-Questions/ambari-agent-cant-start/m-p/181526#M143731</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/26229/uribarih.html" nodeid="26229"&gt;@uri ben-ari&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Also, can you please share the complete &lt;STRONG&gt;"/var/log/ambari-agent/ambari-agent.log"&lt;/STRONG&gt; so we can see if there is any other issue that is preventing ambari-agent from coming up?&lt;/P&gt;</description>
      <pubDate>Sun, 05 Nov 2017 16:51:49 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/ambari-agent-cant-start/m-p/181526#M143731</guid>
      <dc:creator>jsensharma</dc:creator>
      <dc:date>2017-11-05T16:51:49Z</dc:date>
    </item>
    <item>
      <title>Re: ambari-agent cant start</title>
      <link>https://community.cloudera.com/t5/Support-Questions/ambari-agent-cant-start/m-p/181527#M143732</link>
      <description>&lt;P&gt;When we run netstat -tnlpa | grep 10000, we get:&lt;/P&gt;&lt;P&gt;tcp        0      0 45.89.12.111:10000    45.89.12.110:44570    ESTABLISHED 15598/java&lt;/P&gt;&lt;P&gt;tcp        0      0 45.89.12.111:10000    45.89.12.110:55109    ESTABLISHED 15598/java&lt;/P&gt;&lt;P&gt;Regarding iptables, it is stopped. The nc command returns "connection refused", and the full machine name is machine-master03.pop.com.&lt;/P&gt;</description>
      <pubDate>Sun, 05 Nov 2017 17:22:22 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/ambari-agent-cant-start/m-p/181527#M143732</guid>
      <dc:creator>mike_bronson7</dc:creator>
      <dc:date>2017-11-05T17:22:22Z</dc:date>
    </item>
    <item>
      <title>Re: ambari-agent cant start</title>
      <link>https://community.cloudera.com/t5/Support-Questions/ambari-agent-cant-start/m-p/181528#M143733</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/26229/uribarih.html" nodeid="26229"&gt;@uri ben-ari&lt;/A&gt;&lt;/P&gt;&lt;P&gt;As you mentioned, the "nc" command is not connecting, which indicates either a network (firewall) issue or an incorrect hostname mapping.&lt;/P&gt;&lt;P&gt;I see that my HiveServer2 process is bound to all interfaces, as follows:&lt;/P&gt;&lt;PRE&gt;# netstat -tnlpa | grep 10000
tcp        0      0 0.0.0.0:10000               0.0.0.0:*                   LISTEN      1690/java&lt;/PRE&gt;&lt;P&gt;. &lt;/P&gt;&lt;P&gt;Can you please check whether your "ambari-agent" host machine (and the other hosts of the cluster) have the correct IP/hostname mapping in their "/etc/hosts" file, pointing to the HiveServer2 host?&lt;/P&gt;&lt;PRE&gt;# cat /etc/hosts | grep 'machine-master03.pop.com'&lt;BR /&gt;45.89.12.111	machine-master03.pop.com&lt;/PRE&gt;&lt;P&gt;.&lt;/P&gt;&lt;P&gt;In the previously mentioned stack trace I see that the hostname was "machine-&lt;STRONG&gt;master01&lt;/STRONG&gt;.pop.com", but in your recent comment you mentioned the hostname as "machine-&lt;STRONG&gt;master03&lt;/STRONG&gt;.pop.com".&lt;/P&gt;&lt;PRE&gt;[Alert][hive_server_process] Failed with result CRITICAL: ['Connection failed on host machine-master01.pop.com:10000 &lt;/PRE&gt;&lt;P&gt;.&lt;/P&gt;&lt;P&gt;So please check if the IP address and hostname mapping is correct in your "/etc/hosts" file.&lt;/P&gt;</description>
      <pubDate>Sun, 05 Nov 2017 17:38:24 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/ambari-agent-cant-start/m-p/181528#M143733</guid>
      <dc:creator>jsensharma</dc:creator>
      <dc:date>2017-11-05T17:38:24Z</dc:date>
    </item>
    <item>
      <title>Re: ambari-agent cant start</title>
      <link>https://community.cloudera.com/t5/Support-Questions/ambari-agent-cant-start/m-p/181529#M143734</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/26229/uribarih.html" nodeid="26229"&gt;@uri ben-ari&lt;/A&gt;&lt;/P&gt;&lt;P&gt;As you mentioned, the "nc" command is not connecting, which indicates either a network (firewall) issue or an incorrect hostname mapping.&lt;/P&gt;&lt;P&gt;I see that my HiveServer2 process is bound to all interfaces, as follows:&lt;/P&gt;&lt;PRE&gt;# netstat -tnlpa | grep 10000
tcp  0  0 0.0.0.0:10000  0.0.0.0:*  LISTEN  1690/java&lt;/PRE&gt;&lt;P&gt;. &lt;/P&gt;&lt;P&gt;Can you please check whether your "ambari-agent" host machine (and the other hosts of the cluster) have the correct IP/hostname mapping in their "/etc/hosts" file, pointing to the HiveServer2 host?&lt;/P&gt;&lt;PRE&gt;# cat /etc/hosts | grep 'machine-master03.pop.com'&lt;BR /&gt;45.89.12.111	machine-master03.pop.com&lt;/PRE&gt;&lt;P&gt;.&lt;/P&gt;&lt;P&gt;In the previously mentioned stack trace I see that the hostname was "machine-&lt;STRONG&gt;master01&lt;/STRONG&gt;.pop.com", but in your recent comment you mentioned the hostname as "machine-&lt;STRONG&gt;master03&lt;/STRONG&gt;.pop.com".&lt;/P&gt;&lt;P&gt;So please check if the IP address and hostname mapping is correct in your&lt;STRONG&gt; "/etc/hosts"&lt;/STRONG&gt; file.&lt;/P&gt;&lt;PRE&gt;[Alert][hive_server_process] Failed with result CRITICAL: ['Connection failed on host machine-master01.pop.com:10000 &lt;/PRE&gt;</description>
      <pubDate>Sun, 05 Nov 2017 17:42:01 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/ambari-agent-cant-start/m-p/181529#M143734</guid>
      <dc:creator>jsensharma</dc:creator>
      <dc:date>2017-11-05T17:42:01Z</dc:date>
    </item>
    <item>
      <title>Re: ambari-agent cant start</title>
      <link>https://community.cloudera.com/t5/Support-Questions/ambari-agent-cant-start/m-p/181530#M143735</link>
      <description>&lt;P&gt;Yes, the IPs and hostnames are now OK, but we still can't start the ambari-agent. Do you think we need to restart the process that holds port 10000?&lt;/P&gt;</description>
      <pubDate>Sun, 05 Nov 2017 18:24:23 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/ambari-agent-cant-start/m-p/181530#M143735</guid>
      <dc:creator>mike_bronson7</dc:creator>
      <dc:date>2017-11-05T18:24:23Z</dc:date>
    </item>
    <item>
      <title>Re: ambari-agent cant start</title>
      <link>https://community.cloudera.com/t5/Support-Questions/ambari-agent-cant-start/m-p/181531#M143736</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/26229/uribarih.html" nodeid="26229"&gt;@uri ben-ari&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Yes, please try restarting the HiveServer2 process to see if it comes up fine and no errors are observed in the hiveserver2 logs. We can also check whether port 10000 opened successfully.&lt;/P&gt;&lt;P&gt;Then we can try restarting the agent to see if it starts fine.&lt;/P&gt;&lt;P&gt;If agent startup still fails, then please share the *complete* ambari-agent logs.&lt;/P&gt;</description>
      <pubDate>Sun, 05 Nov 2017 18:34:35 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/ambari-agent-cant-start/m-p/181531#M143736</guid>
      <dc:creator>jsensharma</dc:creator>
      <dc:date>2017-11-05T18:34:35Z</dc:date>
    </item>
    <item>
      <title>Re: ambari-agent cant start</title>
      <link>https://community.cloudera.com/t5/Support-Questions/ambari-agent-cant-start/m-p/181532#M143737</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/26229/uribarih.html" nodeid="26229"&gt;@uri ben-ari&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Yes, we can kill it if other users are not using HiveServer2 (just to be sure that they are not running any jobs).&lt;/P&gt;&lt;PRE&gt;# cat /var/run/hive/hive-server.pid
# ps -ef | grep `cat /var/run/hive/hive-server.pid`
# netstat -tnlpa | grep `cat /var/run/hive/hive-server.pid`
# kill -9 `cat /var/run/hive/hive-server.pid`&lt;/PRE&gt;&lt;P&gt;.&lt;/P&gt;&lt;P&gt;The commands before the kill, such as cat &amp;amp; ps, are there to confirm that we are killing the correct process.&lt;/P&gt;</description>
      <pubDate>Sun, 05 Nov 2017 18:54:45 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/ambari-agent-cant-start/m-p/181532#M143737</guid>
      <dc:creator>jsensharma</dc:creator>
      <dc:date>2017-11-05T18:54:45Z</dc:date>
    </item>
  </channel>
</rss>

