
YARN Critical Error



[Screenshot: 91429-image.png]

I want to ask a technical question about HDP. I've just installed the HDP sandbox on my Debian server with Docker. I have encountered the errors below, which say that the connection to port 8042 is refused. I couldn't find any solution on Google. I tested port 8042 with telnet and it is reachable, but the alerts still report the connection as refused.

First error says:

This alert is triggered if the number of down NodeManagers in the cluster is greater than the configured critical threshold. It aggregates the results of NodeManager process checks.

affected: [{1}], total: [{0}]

Second:

This host-level alert is triggered if the NodeManager Web UI is unreachable.

Instance response:
Connection failed to http://sandbox-hdp.hortonworks.com:8042 (<urlopen error [Errno 111] Connection refused>)

Third:

This host-level alert checks the node health property available from the NodeManager component.

Instance response:

Connection failed to http://sandbox-hdp.hortonworks.com:8042/ws/v1/node/info (Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/common-services/YARN/2.1.0.2.0/package/alerts/alert_nodemanager_health.py", line 171, in execute
    url_response = urllib2.urlopen(query, timeout=connection_timeout)
  File "/usr/lib64/python2.7/urllib2.py", line 154, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib64/python2.7/urllib2.py", line 431, in open
    response = self._open(req, data)
  File "/usr/lib64/python2.7/urllib2.py", line 449, in _open
    '_open', req)
  File "/usr/lib64/python2.7/urllib2.py", line 409, in _call_chain
    result = func(*args)
  File "/usr/lib64/python2.7/urllib2.py", line 1244, in http_open
    return self.do_open(httplib.HTTPConnection, req)
  File "/usr/lib64/python2.7/urllib2.py", line 1214, in do_open
    raise URLError(err)
URLError: <urlopen error [Errno 111] Connection refused>
)

Thanks in advance.


Re: YARN Critical Error


@R Goldan

Your screenshot suggests that the alerts are 4 days old. So by the time you checked whether port 8042 was open, the NodeManager may already have been running fine again.
With the Sandbox this can sometimes happen because the VM state is saved and then resumed later.
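To confirm the current state of the port independently of the alert, a small TCP check can help. The following is only a minimal sketch: the function name `port_is_open` is hypothetical, and it assumes you run it on the machine where the Ambari agent runs (for the Docker sandbox, presumably inside the container), since that is where the alert makes its connection from.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds, else False."""
    try:
        # create_connection resolves the host name and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "Connection refused" (Errno 111), timeouts and DNS failures.
        return False


# Example call (host name and port taken from the alert text):
#   port_is_open("sandbox-hdp.hortonworks.com", 8042)
```

If this returns False when run from the agent's side while telnet from another machine succeeds, the two checks are probably not reaching the same endpoint.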

So please try this: click on the alert definition link (for example, the "NodeManager Web UI" alert definition), "Disable" the alert for around 30 seconds, and then "Enable" it again. This re-triggers the alert, so you will see its current state.

Ambari also provides an option to explicitly trigger (run) an alert using an API call like the following (notice the "?run_now=true" part in the URL):

# curl -u admin:admin -i -H 'X-Requested-By:ambari' -X PUT 'http://node1.example.com:8080/api/v1/clusters/ClusterDemo/alert_definitions/151?run_now=true'

Here 151 is your alert definition ID. When you click on the alert definition in the Ambari UI, you can see this ID in the URL, for example http://localhost:8080/#/main/alerts/151
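If you prefer to script this, the same call can be sketched in Python. This is an illustration under assumptions, not an official client: the helper names `definition_id_from_ui_url` and `build_run_now_request` are hypothetical, and the host, cluster name, and admin:admin credentials are simply the example values from the curl command above.

```python
import base64
import urllib.request
from urllib.parse import urlparse


def definition_id_from_ui_url(ui_url):
    """Extract the alert definition ID from an Ambari UI URL such as
    http://localhost:8080/#/main/alerts/151 (the ID is the last path part
    of the URL fragment)."""
    fragment = urlparse(ui_url).fragment  # "/main/alerts/151"
    return int(fragment.rstrip("/").rsplit("/", 1)[-1])


def build_run_now_request(base_url, cluster, definition_id, user, password):
    """Build (but do not send) the PUT request that asks Ambari to run
    the given alert definition immediately."""
    url = "{}/api/v1/clusters/{}/alert_definitions/{}?run_now=true".format(
        base_url.rstrip("/"), cluster, definition_id
    )
    # HTTP Basic auth header, equivalent to curl's -u user:password.
    token = base64.b64encode(
        "{}:{}".format(user, password).encode("utf-8")
    ).decode("ascii")
    headers = {
        "X-Requested-By": "ambari",
        "Authorization": "Basic " + token,
    }
    return urllib.request.Request(url, headers=headers, method="PUT")


# Sending the request (requires a reachable Ambari server):
#   req = build_run_now_request("http://node1.example.com:8080",
#                               "ClusterDemo", 151, "admin", "admin")
#   with urllib.request.urlopen(req, timeout=10) as resp:
#       print(resp.status)
```

Building the request separately from sending it makes it easy to inspect the URL and headers before anything touches the cluster.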