We have an Ambari cluster with HDP version 2.6.4 and Ambari version 2.6.1.
We have a serious problem with the Ambari agents on all machines in the cluster.
For example, let's take one machine, a data node.
On this machine we lose the heartbeat many times (meaning the agent stops talking to the Ambari server). As a workaround we restart the Ambari agent, which fixes the issue, but the problem comes back again and again.
Sometimes we see a "port in use" message in the Ambari agent log, so we kill the PID and restart the Ambari agent (we find the PID with netstat and the port number). But this behavior keeps recurring.
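A minimal sketch of checking whether the port is still held before restarting the agent. It assumes the port in question is the agent ping port and that it is 8670 (the usual `ping_port` default in ambari-agent.ini, but please verify on your hosts). The function name `port_in_use` is hypothetical. It only reports whether the port is taken, it does not identify the owning PID.

```python
import errno
import socket


def port_in_use(port, host="0.0.0.0"):
    """Return True if binding to (host, port) fails because the address is in use."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                return True
            # Any other error (e.g. permission denied) is not what we are testing for.
            raise
    return False


# Assumed default ping port; check ping_port in ambari-agent.ini.
ASSUMED_PING_PORT = 8670
```

Calling `port_in_use(ASSUMED_PING_PORT)` after stopping the agent would show whether a stale process still holds the port, which is the situation where killing the PID is needed.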
To summarize the case: this looks like it may be a network problem or some interruption that breaks the connection between the Ambari agent and the Ambari server, but we are not sure about this and we have no solution.
Please advise: how can we find the root cause?
ambari-agent]# grep "Ping port listener killed" ambari-agent.log
INFO 2019-11-10 09:54:31,717 PingPortListener.py:61 - Ping port listener killed
INFO 2019-11-10 09:54:33,177 PingPortListener.py:61 - Ping port listener killed
INFO 2019-11-10 17:38:30,591 PingPortListener.py:61 - Ping port listener killed
INFO 2019-11-10 17:38:31,846 PingPortListener.py:61 - Ping port listener killed
INFO 2019-11-10 19:03:35,082 PingPortListener.py:61 - Ping port listener killed
INFO 2019-11-10 19:03:36,691 PingPortListener.py:61 - Ping port listener killed
INFO 2019-11-10 19:04:21,926 PingPortListener.py:61 - Ping port listener killed
INFO 2019-11-10 19:04:23,449 PingPortListener.py:61 - Ping port listener killed
INFO 2019-11-10 19:33:31,859 PingPortListener.py:61 - Ping port listener killed
INFO 2019-11-10 19:33:33,234 PingPortListener.py:61 - Ping port listener killed
INFO 2019-11-11 03:34:43,391 PingPortListener.py:61 - Ping port listener killed
INFO 2019-11-11 03:34:44,927 PingPortListener.py:61 - Ping port listener killed
INFO 2019-11-11 03:40:30,522 PingPortListener.py:61 - Ping port listener killed
INFO 2019-11-11 03:40:31,893 PingPortListener.py:61 - Ping port listener killed
INFO 2019-11-11 09:10:31,346 PingPortListener.py:61 - Ping port listener killed
INFO 2019-11-11 09:10:33,092 PingPortListener.py:61 - Ping port listener killed
INFO 2019-11-11 09:40:31,171 PingPortListener.py:61 - Ping port listener killed
INFO 2019-11-11 09:40:32,410 PingPortListener.py:61 - Ping port listener killed
INFO 2019-11-11 16:09:09,491 PingPortListener.py:61 - Ping port listener killed
INFO 2019-11-11 16:09:11,127 PingPortListener.py:61 - Ping port listener killed
INFO 2019-11-11 19:06:34,818 PingPortListener.py:61 - Ping port listener killed
INFO 2019-11-11 19:06:36,500 PingPortListener.py:61 - Ping port listener killed
INFO 2019-11-11 20:07:07,720 PingPortListener.py:61 - Ping port listener killed
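To see when the listener is being killed and how far apart the incidents are, the grep output above can be grouped into events. This is a rough sketch, not an Ambari tool: the function names `kill_events` and `gaps` are hypothetical, and the 5-second grouping window is an assumption based on the messages appearing in pairs one to two seconds apart.

```python
import re
from datetime import datetime, timedelta

# Captures the timestamp (without milliseconds) of each "killed" entry.
PATTERN = re.compile(
    r"INFO (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3} "
    r"PingPortListener\.py:\d+ - Ping port listener killed"
)


def kill_events(log_text, window_seconds=5):
    """Return the first timestamp of each burst of 'killed' messages.

    Messages closer together than window_seconds count as one event.
    """
    stamps = [
        datetime.strptime(raw, "%Y-%m-%d %H:%M:%S")
        for raw in PATTERN.findall(log_text)
    ]
    events = []
    previous = None
    for stamp in stamps:
        if previous is None or stamp - previous > timedelta(seconds=window_seconds):
            events.append(stamp)
        previous = stamp
    return events


def gaps(events):
    """Return the time between each pair of consecutive events."""
    return [later - earlier for earlier, later in zip(events, events[1:])]
```

Feeding the contents of ambari-agent.log to `kill_events` and printing the result next to `gaps` makes it easier to compare the incident times with cron jobs, network maintenance windows, or entries in the Ambari server log from the same minutes.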