Member since 01-25-2016 | 3 Posts | 0 Kudos Received | 0 Solutions
02-01-2016
05:12 AM
The issue is resolved. Following Michalis's recommendation, I replaced the chrony service with the NTP service on all my hosts, and all the errors stopped: not only the ones explicitly stating "Failed to collect NTP metrics" but all the others as well. Apparently all of these errors were somehow related to the inability to collect NTP metrics. Thank you!
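A note for anyone who lands here with the same traceback: an `OSError: [Errno 2] No such file or directory` raised from `subprocess.Popen` generally means the executable being launched could not be found. Below is a minimal sketch for checking which time-sync client tools are on a host's PATH. The tool names (`ntpq`, `ntpdc`, `chronyc`) and the helper names are my own assumptions for illustration; the logs in this thread do not show which command the agent actually runs. The sketch is written for Python 3 and is independent of the agent's own Python 2.7 environment.

```python
import shutil

# Hypothetical candidates: the agent's real command does not appear in the logs.
CANDIDATE_TOOLS = ("ntpq", "ntpdc", "chronyc")


def find_time_sync_tools(tools=CANDIDATE_TOOLS):
    """Map each tool name to its full path on PATH, or None if it is missing."""
    return {name: shutil.which(name) for name in tools}


def missing_tools(found):
    """Return the sorted names of the tools that were not found."""
    return sorted(name for name, path in found.items() if path is None)


if __name__ == "__main__":
    found = find_time_sync_tools()
    for name, path in found.items():
        print("%-8s %s" % (name, path or "NOT FOUND"))
```

If the NTP tools are reported as missing on a host that only runs chrony, that would be consistent with the fix described above, although it does not prove it.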
02-01-2016
01:06 AM
Thank you, Michalis, for your quick response.

Regarding the first issue, you mention that it is related to NTP. My operating system is RHEL 7.1, which uses the chrony service by default instead of NTP. Do you recommend replacing the chrony service with the ntp service?

Regarding the second issue, I am providing screenshots from three different services where it occurs.

a) From the Host Monitor:

[01/Feb/2016 10:44:34 +0000] 1237 Monitor-GenericMonitor throttling_logger ERROR (8 skipped) Error fetching metrics at 'http://host-hd-01.corp.nodalpoint.com:8086/jmx'
Traceback (most recent call last):
File "/usr/lib64/cmf/agent/src/cmf/monitor/generic/metric_collectors.py", line 165, in collect_and_parse
simplejson.load(opened_url))
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/simplejson-2.1.2-py2.7-linux-x86_64.egg/simplejson/__init__.py", line 324, in load
return loads(fp.read(),
File "/usr/lib64/python2.7/socket.py", line 351, in read
data = self._sock.recv(rbufsize)
File "/usr/lib64/python2.7/httplib.py", line 567, in read
s = self.fp.read(amt)
File "/usr/lib64/python2.7/socket.py", line 380, in read
data = self._sock.recv(left)
error: [Errno 9] Bad file descriptor

(with its corresponding screenshot)

b) From a YARN NodeManager:

[01/Feb/2016 10:37:19 +0000] 1363 Monitor-GenericMonitor throttling_logger ERROR (6 skipped) Error fetching metrics at 'http://host-hd-03.corp.nodalpoint.com:8042/jmx'
Traceback (most recent call last):
File "/usr/lib64/cmf/agent/src/cmf/monitor/generic/metric_collectors.py", line 165, in collect_and_parse
simplejson.load(opened_url))
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/simplejson-2.1.2-py2.7-linux-x86_64.egg/simplejson/__init__.py", line 324, in load
return loads(fp.read(),
File "/usr/lib64/python2.7/socket.py", line 351, in read
data = self._sock.recv(rbufsize)
File "/usr/lib64/python2.7/httplib.py", line 567, in read
s = self.fp.read(amt)
File "/usr/lib64/python2.7/socket.py", line 380, in read
data = self._sock.recv(left)
error: [Errno 9] Bad file descriptor

(with its corresponding screenshot)

c) From the NameNode:

[01/Feb/2016 10:53:34 +0000] 1237 Monitor-GenericMonitor throttling_logger ERROR (1 skipped) Error fetching metrics at 'http://host-hd-01.corp.nodalpoint.com:8087/jmx'
Traceback (most recent call last):
File "/usr/lib64/cmf/agent/src/cmf/monitor/generic/metric_collectors.py", line 165, in collect_and_parse
simplejson.load(opened_url))
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/simplejson-2.1.2-py2.7-linux-x86_64.egg/simplejson/__init__.py", line 324, in load
return loads(fp.read(),
File "/usr/lib64/python2.7/socket.py", line 351, in read
data = self._sock.recv(rbufsize)
File "/usr/lib64/python2.7/httplib.py", line 567, in read
s = self.fp.read(amt)
File "/usr/lib64/python2.7/socket.py", line 380, in read
data = self._sock.recv(left)
error: [Errno 9] Bad file descriptor

(with its corresponding screenshot)

Please tell me if you require any more information.

Thanks again for your support,
Filaretos
01-29-2016
06:59 AM
Hi all,

I have successfully installed Cloudera Manager 5.5.1 on a private cluster with only HDFS, YARN and Spark. Every 10 to 15 minutes I get health issues reporting "Web Server Status: The Cloudera Manager Agent got an unexpected response from this role's web server." The corresponding entry in the host's Cloudera agent log is the following:

[29/Jan/2016 16:51:32 +0000] 1237 Monitor-HostMonitor throttling_logger ERROR (30 skipped) Failed to collect NTP metrics
Traceback (most recent call last):
File "/usr/lib64/cmf/agent/src/cmf/monitor/host/ntp_monitor.py", line 37, in collect
result, stdout, stderr = self._subprocess_with_timeout(args, self._timeout)
File "/usr/lib64/cmf/agent/src/cmf/monitor/host/ntp_monitor.py", line 30, in _subprocess_with_timeout
return subprocess_with_timeout(args, timeout)
File "/usr/lib64/cmf/agent/src/cmf/subprocess_timeout.py", line 49, in subprocess_with_timeout
p = subprocess.Popen(**kwargs)
File "/usr/lib64/python2.7/subprocess.py", line 711, in __init__
errread, errwrite)
File "/usr/lib64/python2.7/subprocess.py", line 1308, in _execute_child
raise child_exception
OSError: [Errno 2] No such file or directory

And another one:

[29/Jan/2016 16:48:32 +0000] 1237 Monitor-GenericMonitor throttling_logger ERROR (1 skipped) Error fetching metrics at 'http://host-hd-01.corp.nodalpoint.com:8086/jmx'
Traceback (most recent call last):
File "/usr/lib64/cmf/agent/src/cmf/monitor/generic/metric_collectors.py", line 165, in collect_and_parse
simplejson.load(opened_url))
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/simplejson-2.1.2-py2.7-linux-x86_64.egg/simplejson/__init__.py", line 324, in load
return loads(fp.read(),
File "/usr/lib64/python2.7/socket.py", line 351, in read
data = self._sock.recv(rbufsize)
File "/usr/lib64/python2.7/httplib.py", line 567, in read
s = self.fp.read(amt)
File "/usr/lib64/python2.7/socket.py", line 380, in read
data = self._sock.recv(left)
error: [Errno 9] Bad file descriptor

Has anyone else noticed similar issues?

Thank you
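To help narrow down whether a role's web server itself is misbehaving, one option is to request the same `/jmx` URL by hand and check that the response parses as JSON. The following is only a rough sketch under my own assumptions: the function names `parse_jmx` and `fetch_jmx` are hypothetical, the 10-second timeout is arbitrary, and the code is written for Python 3 rather than the agent's Python 2.7. It relies on the Hadoop `/jmx` servlet returning a JSON object with a top-level `beans` list.

```python
import json
import sys
import urllib.request


def parse_jmx(body):
    """Parse a /jmx response body and return its list of beans (empty if absent)."""
    data = json.loads(body)
    return data.get("beans", [])


def fetch_jmx(url, timeout=10):
    """Fetch a /jmx URL and return the parsed list of beans."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_jmx(resp.read().decode("utf-8"))


if __name__ == "__main__":
    if len(sys.argv) == 2:
        # Example argument: http://host-hd-01.corp.nodalpoint.com:8086/jmx
        beans = fetch_jmx(sys.argv[1])
        print("%d beans returned" % len(beans))
    else:
        print("usage: pass the /jmx URL of the role to check")
```

If a manual request like this succeeds every time while the agent still logs the error, the problem is more likely on the agent side than in the role's web server.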
Labels:
- Apache Spark
- Apache YARN
- Cloudera Manager
- HDFS