
Heartbeat Lost on worker machine


We recently added worker06 to the Ambari cluster.

After an ambari-agent restart, the worker machine shows "Heartbeat Lost". Before the restart, the worker machine's heartbeat was OK.

What could be the reason for that?

The ambari-agent log shows the following:

ERROR 2017-11-26 08:27:09,659 script_alert.py:123 - [Alert][yarn_nodemanager_health] Failed with result CRITICAL: ['Connection failed to http://worker06.sys58.com:8042/ws/v1/node/info (Traceback (most recent call last):\n  File "/var/lib/ambari-agent/cache/common-services/YARN/2.1.0.2.0/package/alerts/alert_nodemanager_health.py", line 171, in execute\n    url_response = urllib2.urlopen(query, timeout=connection_timeout)\n  File "/usr/lib64/python2.7/urllib2.py", line 154, in urlopen\n    return opener.open(url, data, timeout)\n  File "/usr/lib64/python2.7/urllib2.py", line 431, in open\n    response = self._open(req, data)\n  File "/usr/lib64/python2.7/urllib2.py", line 449, in _open\n    \'_open\', req)\n  File "/usr/lib64/python2.7/urllib2.py", line 409, in _call_chain\n    result = func(*args)\n  File "/usr/lib64/python2.7/urllib2.py", line 1244, in http_open\n    return self.do_open(httplib.HTTPConnection, req)\n  File "/usr/lib64/python2.7/urllib2.py", line 1214, in do_open\n    raise URLError(err)\nURLError: <urlopen error [Errno 111] Connection refused>\n)']
Michael-Bronson
1 ACCEPTED SOLUTION


The problem is solved. We found a wrong entry in the hosts file /etc/hosts (the wrong IP address for the host).

Editing the hosts file also fixed the DNS configuration, and this solved the problem.
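
For anyone hitting the same symptom, here is a rough sketch of how the check could be scripted. It is not part of Ambari. It assumes Python 3, and the helper names (parse_hosts, find_mismatch, port_open) and the expected IP 192.0.2.16 are made up for illustration; only the hostname and port 8042 come from the log above.

```python
import socket


def parse_hosts(text):
    """Map each hostname or alias in hosts-file text to the IPs assigned to it."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # strip comments, skip blank lines
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            mapping.setdefault(name, []).append(ip)
    return mapping


def find_mismatch(hostname, expected_ip, hosts_text):
    """Return the IPs the hosts file gives hostname that differ from expected_ip."""
    entries = parse_hosts(hosts_text).get(hostname, [])
    return [ip for ip in entries if ip != expected_ip]


def port_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    host = "worker06.sys58.com"   # hostname taken from the alert in the log
    expected_ip = "192.0.2.16"    # placeholder: use the worker's real address
    with open("/etc/hosts") as f:
        print("mismatched /etc/hosts entries:", find_mismatch(host, expected_ip, f.read()))
    try:
        print("resolves to:", socket.gethostbyname(host))
    except socket.gaierror as err:
        print("does not resolve:", err)
    print("port 8042 reachable:", port_open(host, 8042))
```

Run it on the worker and on the Ambari server. A non-empty mismatch list, or a resolved address that is not the worker's real IP, would point to the same kind of /etc/hosts problem described above.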

Michael-Bronson
