Failed to run DnsTest.

We are facing two issues on this production server: the DNS test fails and the NTP check (ntpq) times out.

 

[17/Dec/2020 00:41:47 +0000] 9333 Monitor-HostMonitor throttling_logger ERROR    Timeout with args ['ntpq', '-np']
[17/Dec/2020 00:41:47 +0000] 9333 Monitor-HostMonitor throttling_logger ERROR    Failed to collect NTP metrics
[17/Dec/2020 00:42:08 +0000] 9333 MainThread agent        ERROR    Failed to configure inotify. Parcel repository will not auto-refresh.
[17/Dec/2020 01:31:26 +0000] 9333 Monitor-HostMonitor throttling_logger ERROR    Timed out waiting for worker process collecting filesystem usage to complete. This may occur if the host has an NFS or other remote filesystem that is not responding to requests in a timely fashion. Current nodev filesystems: /dev/shm,/run,/sys/fs/cgroup,/run/cloudera-scm-agent/process,/run/cloudera-scm-agent/process,/run/user/0
[17/Dec/2020 01:31:54 +0000] 9333 MainThread agent        ERROR    Failed to configure inotify. Parcel repository will not auto-refresh.
[17/Dec/2020 08:18:56 +0000] 9333 MonitorDaemon-Reporter throttling_logger ERROR    Error sending messages to firehose: mgmt-SERVICEMONITOR-b9bbe3508c15c97839a21fc44a6226b5
[17/Dec/2020 10:07:14 +0000] 9333 MainThread agent        ERROR    Failed to configure inotify. Parcel repository will not auto-refresh.
[17/Dec/2020 11:15:43 +0000] 9333 MainThread agent        ERROR    Failed to configure inotify. Parcel repository will not auto-refresh.
[17/Dec/2020 11:31:05 +0000] 9333 DnsResolutionMonitor throttling_logger ERROR    Timeout with args ['/usr/java/jdk1.8.0_251-amd64/bin/java', '-classpath', '/opt/cloudera/cm/lib/agent-6.3.0.jar', 'com.cloudera.cmon.agent.DnsTest']
[17/Dec/2020 11:31:05 +0000] 9333 DnsResolutionMonitor throttling_logger ERROR    Failed to run DnsTest.
[17/Dec/2020 11:31:18 +0000] 9333 MainThread agent        ERROR    Failed to configure inotify. Parcel repository will not auto-refresh.
[17/Dec/2020 12:13:44 +0000] 9333 MainThread agent        ERROR    Failed to configure inotify. Parcel repository will not auto-refresh.
[ PROD

[17/Dec/2020 11:07:11 +0000] 9333 MainThread agent        WARNING  Long HB processing time: 16.7383139133
[17/Dec/2020 11:07:23 +0000] 9333 Monitor-HostMonitor filesystem_map WARNING  Failed to join worker process collecting filesystem usage. All nodev filesystems will have unknown usage until the worker process is no longer active. Current nodev filesystems: /dev/shm,/run,/sys/fs/cgroup,/run/cloudera-scm-agent/process,/run/cloudera-scm-agent/process,/run/user/0
[17/Dec/2020 11:15:29 +0000] 9333 MainThread agent        WARNING  Supervisor failed (pid 97042).  Restarting agent.
[17/Dec/2020 11:15:43 +0000] 9333 MainThread agent        ERROR    Failed to configure inotify. Parcel repository will not auto-refresh.
[17/Dec/2020 11:15:43 +0000] 9333 MainThread throttling_logger WARNING  Failed parsing alternatives line: libnssckbi.so.x86_64 string index out of range  link currently points to /usr/lib64/pkcs11/p11-kit-trust.so
[17/Dec/2020 11:15:48 +0000] 9333 MainThread agent        WARNING  Long HB processing time: 5.60892701149
[17/Dec/2020 11:30:53 +0000] 9333 Monitor-HostMonitor filesystem_map WARNING  Failed to join worker process collecting filesystem usage. All nodev filesystems will have unknown usage until the worker process is no longer active. Current nodev filesystems: /dev/shm,/run,/sys/fs/cgroup,/run/cloudera-scm-agent/process,/run/cloudera-scm-agent/process,/run/user/0
[17/Dec/2020 11:30:54 +0000] 9333 MainThread agent        WARNING  Long HB processing time: 33.9636788368
[17/Dec/2020 11:30:54 +0000] 9333 MainThread agent        WARNING  Delayed HB: 19s since last
[17/Dec/2020 11:31:05 +0000] 9333 DnsResolutionMonitor throttling_logger ERROR    Timeout with args ['/usr/java/jdk1.8.0_251-amd64/bin/java', '-classpath', '/opt/cloudera/cm/lib/agent-6.3.0.jar', 'com.cloudera.cmon.agent.DnsTest']
[17/Dec/2020 11:31:05 +0000] 9333 DnsResolutionMonitor throttling_logger ERROR    Failed to run DnsTest.
[17/Dec/2020 11:31:09 +0000] 9333 MainThread agent        WARNING  Supervisor failed (pid 97042).  Restarting agent.
[17/Dec/2020 11:31:18 +0000] 9333 MainThread agent        ERROR    Failed to configure inotify. Parcel repository will not auto-refresh.
[17/Dec/2020 11:31:18 +0000] 9333 MainThread throttling_logger WARNING  Failed parsing alternatives line: libnssckbi.so.x86_64 string index out of range  link currently points to /usr/lib64/pkcs11/p11-kit-trust.so
[17/Dec/2020 11:31:23 +0000] 9333 MainThread agent        WARNING  Long HB processing time: 5.59937500954
[17/Dec/2020 12:07:16 +0000] 9333 MainThread agent        WARNING  Long HB processing time: 18.2336220741
[17/Dec/2020 12:07:16 +0000] 9333 MainThread agent        WARNING  Delayed HB: 3s since last
[17/Dec/2020 12:07:21 +0000] 9333 Monitor-HostMonitor filesystem_map WARNING  Failed to join worker process collecting filesystem usage. All nodev filesystems will have unknown usage until the worker process is no longer active. Current nodev filesystems: /dev/shm,/run,/sys/fs/cgroup,/run/cloudera-scm-agent/process,/run/cloudera-scm-agent/process,/run/user/0
[17/Dec/2020 12:13:34 +0000] 9333 MainThread agent        WARNING  Supervisor failed (pid 97042).  Restarting agent.
[17/Dec/2020 12:13:44 +0000] 9333 MainThread agent        ERROR    Failed to configure inotify. Parcel repository will not auto-refresh.
[17/Dec/2020 12:13:44 +0000] 9333 MainThread throttling_logger WARNING  Failed parsing alternatives line: libnssckbi.so.x86_64 string index out of range  link currently points to /usr/lib64/pkcs11/p11-kit-trust.so
[17/Dec/2020 12:13:50 +0000] 9333 MainThread agent        WARNING  Long HB processing time: 5.56325793266
[ PROD root@

[17/Dec/2020 11:30:53 +0000] 9333 Monitor-HostMonitor filesystem_map WARNING  Failed to join worker process collecting filesystem usage. All nodev filesystems will have unknown usage until the worker process is no longer active. Current nodev filesystems: /dev/shm,/run,/sys/fs/cgroup,/run/cloudera-scm-agent/process,/run/cloudera-scm-agent/process,/run/user/0
[17/Dec/2020 11:30:54 +0000] 9333 MainThread agent        WARNING  Long HB processing time: 33.9636788368
[17/Dec/2020 11:30:54 +0000] 9333 MainThread agent        WARNING  Delayed HB: 19s since last
[17/Dec/2020 11:31:05 +0000] 9333 DnsResolutionMonitor throttling_logger ERROR    Timeout with args ['/usr/java/jdk1.8.0_251-amd64/bin/java', '-classpath', '/opt/cloudera/cm/lib/agent-6.3.0.jar', 'com.cloudera.cmon.agent.DnsTest']
None
[17/Dec/2020 11:31:05 +0000] 9333 DnsResolutionMonitor throttling_logger ERROR    Failed to run DnsTest.
Traceback (most recent call last):
  File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/cmf/monitor/host/dns_names.py", line 87, in collect_dns_metrics
    self._subprocess_with_timeout(args, self._poll_timeout)
  File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/cmf/monitor/host/dns_names.py", line 59, in _subprocess_with_timeout
    return subprocess_with_timeout(args, timeout)
  File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/cmf/subprocess_timeout.py", line 95, in subprocess_with_timeout
    raise Exception("timeout with args %s" % args)
Exception: timeout with args ['/usr/java/jdk1.8.0_251-amd64/bin/java', '-classpath', '/opt/cloudera/cm/lib/agent-6.3.0.jar', 'com.cloudera.cmon.agent.DnsTest']
[17/Dec/2020 11:31:09 +0000] 9333 MainThread agent        WARNING  Supervisor failed (pid 97042).  Restarting agent.
[17/Dec/2020 11:31:11 +0000] 9333 MainThread agent        INFO     ================================================================================
[17/Dec/2020 11:31:11 +0000] 9333 MainThread agent        INFO     SCM Agent Version: 6.3.0
[17/Dec/2020 11:31:11 +0000] 9333 MainThread agent        INFO     Agent Protocol Version: 4
[17/Dec/2020 11:31:11 +0000] 9333 MainThread __init__     INFO     Agent UUID file was last modified at 2020-06-22 17:15:03.518251
[17/Dec/2020 11:31:11 +0000] 9333 MainThread agent        INFO     Using Host ID: 2b537ad6-388e-4e32-bea2-7584f509d4df
[17/Dec/2020 11:31:11 +0000] 9333 MainThread agent        INFO     Using directory: /run/cloudera-scm-agent
[17/Dec/2020 11:31:11 +0000] 9333 MainThread agent        INFO     Using supervisor binary path: /opt/cloudera/cm-agent/bin/../bin/supervisord
[17/Dec/2020 11:31:11 +0000] 9333 MainThread agent        INFO     Agent Logging Level: INFO
[17/Dec/2020 11:31:11 +0000] 9333 MainThread agent        INFO     Agent config:
[17/Dec/2020 11:31:11 +0000] 9333 MainThread agent        INFO          Security.use_tls        = 0
[17/Dec/2020 11:31:11 +0000] 9333 MainThread agent        INFO          Security.max_cert_depth = 9
[17/Dec/2020 11:3

[17/Dec/2020 10:07:14 +0000] 9333 MainThread agent        ERROR    Failed to configure inotify. Parcel repository will not auto-refresh.
Traceback (most recent call last):
  File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/cmf/agent.py", line 1007, in _init_after_first_heartbeat_response
    self.inotify = self.repo.configure_inotify()
  File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/cmf/parcel.py", line 408, in configure_inotify
    wm = pyinotify.WatchManager()
  File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/pyinotify.py", line 1783, in __init__
    raise OSError(err % self._inotify_wrapper.str_errno())
OSError: Cannot initialize new instance of inotify, Errno=Too many open files (EMFILE)
[17/Dec/2020 10:07:14 +0000] 9333 MainThread downloader   INFO     Downloader path: /opt/cloudera/parcel-cache
[17/Dec/2020 10:07:14 +0000] 9333 MainThread parcel_cache INFO     Using /opt/cloudera/parcel-cache for parcel cache
[17/Dec/2020 10:07:14 +0000] 9333 MainThread throttling_logger WARNING  Failed parsing alternatives line: libnssckbi.so.x86_64 string index out of range  link currently points to /usr/lib64/pkcs11/p11-kit-trust.so
[

1 REPLY

@Raj77 The most significant error message is this one:

OSError: Cannot initialize new instance of inotify, Errno=Too many open files (EMFILE)

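This means the agent cannot create a new inotify instance because a descriptor limit has been reached. EMFILE from inotify is returned either when the process has hit its limit on open file descriptors or when the per-user limit on inotify instances is exhausted. One easy way to mitigate this is to hard stop and then start the agent, as described in the documentation.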
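
Before restarting, it can help to confirm that descriptors are really exhausted. The following is a minimal sketch, assuming a Linux host with /proc mounted and enough privileges to read the agent's /proc entries; the function names are hypothetical helpers, not part of Cloudera Manager.

```python
import os


def count_open_fds(pid):
    """Count entries in /proc/<pid>/fd (needs root or ownership of the process)."""
    return len(os.listdir("/proc/%d/fd" % pid))


def parse_soft_nofile_limit(limits_text):
    """Extract the soft 'Max open files' value from /proc/<pid>/limits text."""
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            # Remaining columns: soft limit, hard limit, units
            soft = line[len("Max open files"):].split()[0]
            return None if soft == "unlimited" else int(soft)
    return None


def fd_usage(pid):
    """Return (open descriptor count, soft limit or None) for a process."""
    with open("/proc/%d/limits" % pid) as f:
        limit = parse_soft_nofile_limit(f.read())
    return count_open_fds(pid), limit
```

For the agent in the logs above, `fd_usage(9333)` would return the current count next to the soft limit. If the count is well below the limit, the per-user inotify instance limit in /proc/sys/fs/inotify/max_user_instances is the other value worth checking.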

 

The second error concerns DnsTest:

[17/Dec/2020 11:31:05 +0000] 9333 DnsResolutionMonitor throttling_logger ERROR    Failed to run DnsTest.

This looks like an issue with your Java installation. Most probably you should remove the offending packages (mostly OpenJDK) from the host, remove any broken alternatives (zero-byte files or entries shown in red) from /var/lib/alternatives and /etc/alternatives, and then restart the CM agent.
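
The broken alternatives can be listed before anything is deleted. This is a rough sketch that only reports candidates and changes nothing. It assumes that "red" refers to dangling symlinks (which `ls --color` typically highlights in red), and `find_broken_alternatives` is a hypothetical helper name.

```python
import os


def find_broken_alternatives(admin_dir="/var/lib/alternatives",
                             link_dir="/etc/alternatives"):
    """Return (zero_byte_files, dangling_links) found in the two directories."""
    zero_byte, dangling = [], []
    if os.path.isdir(admin_dir):
        for name in sorted(os.listdir(admin_dir)):
            path = os.path.join(admin_dir, name)
            # An empty state file cannot describe a valid alternative group.
            if os.path.isfile(path) and os.path.getsize(path) == 0:
                zero_byte.append(path)
    if os.path.isdir(link_dir):
        for name in sorted(os.listdir(link_dir)):
            path = os.path.join(link_dir, name)
            # A dangling symlink is a link whose target no longer exists.
            if os.path.islink(path) and not os.path.exists(path):
                dangling.append(path)
    return zero_byte, dangling


if __name__ == "__main__":
    empty_files, dead_links = find_broken_alternatives()
    for path in empty_files:
        print("zero-byte alternatives file: " + path)
    for path in dead_links:
        print("dangling alternatives link:  " + path)
```

Reviewing the printed paths by hand before removing them avoids deleting an alternative that another package still relies on.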


Cheers!