Created 12-29-2018 02:14 AM
Hi,
I have an issue in my CDH cluster. The cluster has been running for a few days, but today, on the CM Hosts tab, the Host Machine Health Test shows:
This host is in contact with the Cloudera Manager Server. This host is not in contact with the Host Monitor.
The Health History shows an error and then becomes healthy again.
CM Agent log:
[29/Dec/2018 16:42:11 +0000] 24147 ImpalaDaemonQueryMonitoring firehose INFO >>>>>>>>>>>>>>>>address : hadoop05.ddxq.idc port: 9997<<<<<<<<<<<<<<
[29/Dec/2018 16:42:12 +0000] 24147 ImpalaDaemonQueryMonitoring firehose INFO >>>>>>>>>>>>>>>>address : hadoop05.ddxq.idc port: 9997<<<<<<<<<<<<<<
[29/Dec/2018 16:42:13 +0000] 24147 ImpalaDaemonQueryMonitoring firehose INFO >>>>>>>>>>>>>>>>address : hadoop05.ddxq.idc port: 9997<<<<<<<<<<<<<<
[29/Dec/2018 16:42:18 +0000] 24147 ImpalaDaemonQueryMonitoring firehose INFO >>>>>>>>>>>>>>>>address : hadoop05.ddxq.idc port: 9997<<<<<<<<<<<<<<
[29/Dec/2018 16:42:20 +0000] 24147 ImpalaDaemonQueryMonitoring firehose INFO >>>>>>>>>>>>>>>>address : hadoop05.ddxq.idc port: 9997<<<<<<<<<<<<<<
[29/Dec/2018 16:42:20 +0000] 24147 ImpalaDaemonQueryMonitoring firehose INFO >>>>>>>>>>>>>>>>address : hadoop05.ddxq.idc port: 9997<<<<<<<<<<<<<<
[29/Dec/2018 16:42:21 +0000] 24147 ImpalaDaemonQueryMonitoring firehose INFO >>>>>>>>>>>>>>>>address : hadoop05.ddxq.idc port: 9997<<<<<<<<<<<<<<
[29/Dec/2018 16:42:24 +0000] 24147 ImpalaDaemonQueryMonitoring firehose INFO >>>>>>>>>>>>>>>>address : hadoop05.ddxq.idc port: 9997<<<<<<<<<<<<<<
[29/Dec/2018 16:42:24 +0000] 24147 ImpalaDaemonQueryMonitoring firehose INFO >>>>>>>>>>>>>>>>address : hadoop05.ddxq.idc port: 9997<<<<<<<<<<<<<<
[29/Dec/2018 16:42:25 +0000] 24147 ImpalaDaemonQueryMonitoring firehose INFO >>>>>>>>>>>>>>>>address : hadoop05.ddxq.idc port: 9997<<<<<<<<<<<<<<
[29/Dec/2018 16:42:25 +0000] 24147 MonitorDaemon-Reporter firehose INFO >>>>>>>>>>>>>>>>address : hadoop05.ddxq.idc port: 9997<<<<<<<<<<<<<<
[29/Dec/2018 16:42:39 +0000] 24147 MainThread agent WARNING Long HB processing time: 6.51629519463
[29/Dec/2018 16:42:53 +0000] 24147 MainThread agent WARNING Long HB processing time: 5.32933592796
[29/Dec/2018 16:45:06 +0000] 24147 MainThread heartbeat_tracker INFO HB stats (seconds): num:40 LIFE_MIN:0.07 min:0.10 mean:0.29 max:0.88 LIFE_MAX:0.72
[29/Dec/2018 16:45:08 +0000] 24147 MonitorDaemon-Reporter throttling_logger ERROR (3 skipped) Error sending messages to firehose: mgmt-SERVICEMONITOR-02008505edb7b85b2295119db7eba412
Traceback (most recent call last):
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.1-py2.7.egg/cmf/monitor/firehose.py", line 125, in _send
self._requestor.request('sendAgentMessages', dict(messages=UNICODE_SANITIZER(messages)))
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/avro-1.6.3-py2.7.egg/avro/ipc.py", line 141, in request
return self.issue_request(call_request, message_name, request_datum)
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/avro-1.6.3-py2.7.egg/avro/ipc.py", line 254, in issue_request
call_response = self.transceiver.transceive(call_request)
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/avro-1.6.3-py2.7.egg/avro/ipc.py", line 482, in transceive
self.write_framed_message(request)
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/avro-1.6.3-py2.7.egg/avro/ipc.py", line 501, in write_framed_message
self.conn.request(req_method, self.req_resource, req_body, req_headers)
File "/usr/lib64/python2.7/httplib.py", line 1017, in request
self._send_request(method, url, body, headers)
File "/usr/lib64/python2.7/httplib.py", line 1051, in _send_request
self.endheaders(body)
File "/usr/lib64/python2.7/httplib.py", line 1013, in endheaders
self._send_output(message_body)
File "/usr/lib64/python2.7/httplib.py", line 864, in _send_output
self.send(msg)
File "/usr/lib64/python2.7/httplib.py", line 840, in send
self.sock.sendall(data)
File "/usr/lib64/python2.7/socket.py", line 224, in meth
return getattr(self._sock,name)(*args)
error: [Errno 32] Broken pipe
[29/Dec/2018 16:45:08 +0000] 24147 ImpalaDaemonQueryMonitoring firehose INFO >>>>>>>>>>>>>>>>address : hadoop05.ddxq.idc port: 9997<<<<<<<<<<<<<<
[29/Dec/2018 16:45:08 +0000] 24147 MonitorDaemon-Reporter firehose INFO >>>>>>>>>>>>>>>>address : hadoop05.ddxq.idc port: 9995<<<<<<<<<<<<<<
[29/Dec/2018 16:45:08 +0000] 24147 MonitorDaemon-Reporter firehose INFO >>>>>>>>>>>>>>>>address : hadoop05.ddxq.idc port: 9995<<<<<<<<<<<<<<
[29/Dec/2018 16:45:09 +0000] 24147 MainThread agent WARNING Long HB processing time: 6.45350694656
[29/Dec/2018 16:45:09 +0000] 24147 ImpalaDaemonQueryMonitoring firehose INFO >>>>>>>>>>>>>>>>address : hadoop05.ddxq.idc port: 9997<<<<<<<<<<<<<<
[29/Dec/2018 16:45:09 +0000] 24147 ImpalaDaemonQueryMonitoring firehose INFO >>>>>>>>>>>>>>>>address : hadoop05.ddxq.idc port: 9997<<<<<<<<<<<<<<
[29/Dec/2018 16:45:10 +0000] 24147 ImpalaDaemonQueryMonitoring firehose INFO >>>>>>>>>>>>>>>>address : hadoop05.ddxq.idc port: 9997<<<<<<<<<<<<<<
[29/Dec/2018 16:45:15 +0000] 24147 ImpalaDaemonQueryMonitoring firehose INFO >>>>>>>>>>>>>>>>address : hadoop05.ddxq.idc port: 9997<<<<<<<<<<<<<<
[29/Dec/2018 16:45:17 +0000] 24147 ImpalaDaemonQueryMonitoring firehose INFO >>>>>>>>>>>>>>>>address : hadoop05.ddxq.idc port: 9997<<<<<<<<<<<<<<
[29/Dec/2018 16:45:18 +0000] 24147 ImpalaDaemonQueryMonitoring firehose INFO >>>>>>>>>>>>>>>>address : hadoop05.ddxq.idc port: 9997<<<<<<<<<<<<<<
[29/Dec/2018 16:45:19 +0000] 24147 ImpalaDaemonQueryMonitoring firehose INFO >>>>>>>>>>>>>>>>address : hadoop05.ddxq.idc port: 9997<<<<<<<<<<<<<<
[29/Dec/2018 16:45:21 +0000] 24147 ImpalaDaemonQueryMonitoring firehose INFO >>>>>>>>>>>>>>>>address : hadoop05.ddxq.idc port: 9997<<<<<<<<<<<<<<
[29/Dec/2018 16:45:21 +0000] 24147 ImpalaDaemonQueryMonitoring firehose INFO >>>>>>>>>>>>>>>>address : hadoop05.ddxq.idc port: 9997<<<<<<<<<<<<<<
[29/Dec/2018 16:45:22 +0000] 24147 ImpalaDaemonQueryMonitoring firehose INFO >>>>>>>>>>>>>>>>address : hadoop05.ddxq.idc port: 9997<<<<<<<<<<<<<<
The address and port above are printed by my own edit to firehose.py; I can telnet to that hostname and port successfully.
I also can't find any corresponding error in /var/log/cloudera-scm-firehose.
I found a similar issue: https://community.cloudera.com/t5/Cloudera-Altus-Director/MonitorDaemon-Reporter-throttling-logger-E...
and tried increasing the Service Monitor and Host Monitor memory, but that did not solve it.
Please help me!
CM version: 5.16.1
Machine configuration as shown:
Created 01-01-2019 10:54 PM
I couldn't find an option to edit my post, and all the images showed an error, so I'm describing my problem again.
Hi,
I have an issue in my CDH cluster. The cluster has been running for a few days, but today, on the CM Hosts tab, the Host Machine Health Test shows:
Agent Status: This host is in contact with the Cloudera Manager Server. This host is not in contact with the Host Monitor.
The Health History shows an error and then becomes healthy again:
- 2:30 PM: 3 Became Good
- 2:29:03 PM: Network Interface Speed Unknown, 1 Still Bad
- 2:28:58 PM: 1 Became Bad, 1 Became Unknown
- 2:26 PM: 3 Became Good
- 2:25:18 PM: Network Interface Speed Unknown, 1 Still Bad
- 2:25:13 PM: 1 Became Bad, 1 Became Unknown
- 2:23 PM: 3 Became Good
CM Agent log: identical to the log in my first post above, ending with the "[Errno 32] Broken pipe" traceback from MonitorDaemon-Reporter.
I re-edited firehose.py to print the address and port, and telnet to that hostname and port succeeds.
I also can't find any corresponding error in /var/log/cloudera-scm-firehose.
I found a similar issue: https://community.cloudera.com/t5/Cloudera-Altus-Director/MonitorDaemon-Reporter-throttling-logger-E...
and tried increasing the Service Monitor and Host Monitor memory, but that did not solve it.
Please help me!
CM version: 5.16.1
Machine OS version: CentOS 7.3.1611
Created 01-02-2019 11:02 AM
Thanks for discussing the issue you are facing.
This issue does not pose any threat to the functionality of your cluster, so the impact should be minimal.
Let's start with what we know:
- The agent attempts to make an HTTP connection and gets the following:
[29/Dec/2018 16:45:08 +0000] 24147 MonitorDaemon-Reporter throttling_logger ERROR (3 skipped) Error sending messages to firehose: mgmt-SERVICEMONITOR-02008505edb7b85b2295119db7eba412
. . . .
File "/usr/lib64/python2.7/socket.py", line 224, in meth
return getattr(self._sock,name)(*args)
error: [Errno 32] Broken pipe
This tells us a couple of important things:
(1)
The agent was able to make a TCP connection to the Host Monitor on port 9995 (the Host Monitor Listen Port). After that, the communication was severed, as the connection appears to have been dropped somewhere between the agent and the Host Monitor server.
(2)
This appears to happen frequently.
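To make point (1) concrete: "Broken pipe" does not mean the connect failed; it means a write hit a connection the peer had already dropped. A minimal sketch (plain Python sockets, not Cloudera code) reproduces this: the connect succeeds, just like the telnet test, yet a later write raises EPIPE (or ECONNRESET) because the other side closed the connection.

```python
import errno
import socket
import threading
import time

def run_closing_server(srv):
    """Accept one connection and immediately drop it, like a severed link."""
    conn, _ = srv.accept()
    conn.close()
    srv.close()

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 0))          # any free port
srv.listen(1)
port = srv.getsockname()[1]
threading.Thread(target=run_closing_server, args=(srv,), daemon=True).start()

cli = socket.create_connection(("127.0.0.1", port))  # succeeds, like telnet
caught = None
try:
    for _ in range(30):
        cli.sendall(b"x" * 65536)   # first write may be buffered and "succeed";
        time.sleep(0.05)            # a later write hits the closed peer
except OSError as exc:
    caught = exc                    # BrokenPipeError / ConnectionResetError
finally:
    cli.close()

print(repr(caught))
```

So a passing telnet check only proves the TCP handshake works; it says nothing about whether the connection survives long enough for the agent to finish sending its payload.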
The agent will periodically capture host information and then upload that to the Host Monitor for indexing and storage. The first place to look would be the Host Monitor log in /var/log/:
mgmt-cmf-mgmt-HOSTMONITOR*
You mentioned you could not find the error log, but note that there is only the one log file for all log information.
If you do not see anything wrong there, then we might need to employ more advanced diagnostics to determine what is happening.
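One reading aid for that error line: the "(3 skipped)" prefix comes from throttled logging. A rough sketch of the idea (our own illustration, not Cloudera's actual code; all names here are hypothetical): repeats of the same message inside a cooldown window are counted rather than printed, and the count is prefixed to the next line that does get through.

```python
import time

class ThrottlingLogger:
    """Suppress repeated messages inside a cooldown window, counting them."""

    def __init__(self, cooldown_seconds, clock=time.monotonic):
        self.cooldown = cooldown_seconds
        self.clock = clock
        self.last_emit = None
        self.skipped = 0
        self.emitted = []          # stands in for a real logging backend

    def error(self, message):
        now = self.clock()
        if self.last_emit is not None and now - self.last_emit < self.cooldown:
            self.skipped += 1      # inside the window: count, don't print
            return
        prefix = "(%d skipped) " % self.skipped if self.skipped else ""
        self.emitted.append(prefix + message)
        self.skipped = 0
        self.last_emit = now

# Simulated clock: four errors in quick succession, then one much later.
ticks = iter([0.0, 1.0, 2.0, 3.0, 100.0])
log = ThrottlingLogger(cooldown_seconds=60, clock=lambda: next(ticks))
for _ in range(5):
    log.error("Error sending messages to firehose")

print(log.emitted[-1])  # prints "(3 skipped) Error sending messages to firehose"
```

The practical takeaway: "(3 skipped)" means the send failure is happening more often than the log shows, which supports point (2) above.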
Created 01-02-2019 07:14 PM
The Host Monitor log:
2019-01-03 07:44:01,029 INFO com.cloudera.cmon.tstore.leveldb.LDBTimeSeriesRollupManager: Starting rollup from raw to rollup=TEN_MINUTELY for rollupTimestamp=2019-01-02T23:40:00.000Z
2019-01-03 07:44:02,042 INFO com.cloudera.cmon.tstore.leveldb.LDBTimeSeriesRollupManager: Finished rollup: duration=PT1.013S, numStreamsChecked=42228, numStreamsRolledUp=855
2019-01-03 07:48:01,011 INFO com.cloudera.cmon.tstore.leveldb.LDBResourceManager: Closed: 0 partitions
2019-01-03 07:49:01,028 INFO com.cloudera.cmon.tstore.leveldb.LDBTimeSeriesRollupManager: Running the LDBTimeSeriesRollupManager at 2019-01-02T23:49:01.028Z, forMigratedData=false
2019-01-03 07:54:01,028 INFO com.cloudera.cmon.tstore.leveldb.LDBTimeSeriesRollupManager: Running the LDBTimeSeriesRollupManager at 2019-01-02T23:54:01.028Z, forMigratedData=false
2019-01-03 07:54:01,029 INFO com.cloudera.cmon.tstore.leveldb.LDBTimeSeriesRollupManager: Starting rollup from raw to rollup=TEN_MINUTELY for rollupTimestamp=2019-01-02T23:50:00.000Z
2019-01-03 07:54:01,949 INFO com.cloudera.cmon.tstore.leveldb.LDBTimeSeriesRollupManager: Finished rollup: duration=PT0.920S, numStreamsChecked=42228, numStreamsRolledUp=855
2019-01-03 07:58:01,013 INFO com.cloudera.cmon.tstore.leveldb.LDBResourceManager: Closed: 0 partitions
2019-01-03 07:59:01,028 INFO com.cloudera.cmon.tstore.leveldb.LDBTimeSeriesRollupManager: Running the LDBTimeSeriesRollupManager at 2019-01-02T23:59:01.028Z, forMigratedData=false
2019-01-03 07:59:29,591 INFO com.cloudera.cmf.BasicScmProxy: Failed request to SCM: 302
2019-01-03 07:59:30,591 INFO com.cloudera.cmf.BasicScmProxy: Authentication to SCM required.
2019-01-03 07:59:30,668 INFO com.cloudera.cmf.BasicScmProxy: Using encrypted credentials for SCM
2019-01-03 07:59:30,674 INFO com.cloudera.cmf.BasicScmProxy: Authenticated to SCM.
2019-01-03 08:04:01,029 INFO com.cloudera.cmon.tstore.leveldb.LDBTimeSeriesRollupManager: Running the LDBTimeSeriesRollupManager at 2019-01-03T00:04:01.029Z, forMigratedData=false
2019-01-03 08:04:01,029 INFO com.cloudera.cmon.tstore.leveldb.LDBTimeSeriesRollupManager: Starting rollup from raw to rollup=TEN_MINUTELY for rollupTimestamp=2019-01-03T00:00:00.000Z
2019-01-03 08:04:01,864 INFO com.cloudera.cmon.tstore.leveldb.LDBTimeSeriesRollupManager: Finished rollup: duration=PT0.833S, numStreamsChecked=42228, numStreamsRolledUp=855
2019-01-03 08:04:01,864 INFO com.cloudera.cmon.tstore.leveldb.LDBTimeSeriesRollupManager: Starting rollup from ts_stream_rollup_PT600S to rollup=HOURLY for rollupTimestamp=2019-01-03T00:00:00.000Z
2019-01-03 08:04:02,689 INFO com.cloudera.cmon.tstore.leveldb.LDBTimeSeriesRollupManager: Finished rollup: duration=PT0.825S, numStreamsChecked=42228, numStreamsRolledUp=859
2019-01-03 08:04:02,690 INFO com.cloudera.cmon.tstore.leveldb.LDBTimeSeriesRollupManager: Starting rollup from ts_stream_rollup_PT3600S to rollup=SIX_HOURLY for rollupTimestamp=2019-01-03T00:00:00.000Z
2019-01-03 08:04:03,448 INFO com.cloudera.cmon.tstore.leveldb.LDBTimeSeriesRollupManager: Finished rollup: duration=PT0.758S, numStreamsChecked=42228, numStreamsRolledUp=859
2019-01-03 08:04:03,448 INFO com.cloudera.cmon.tstore.leveldb.LDBTimeSeriesRollupManager: Starting rollup from ts_stream_rollup_PT21600S to rollup=DAILY for rollupTimestamp=2019-01-03T00:00:00.000Z
2019-01-03 08:04:03,939 INFO com.cloudera.cmon.tstore.leveldb.LDBTimeSeriesRollupManager: Finished rollup: duration=PT0.491S, numStreamsChecked=42228, numStreamsRolledUp=2061
2019-01-03 08:04:03,939 INFO com.cloudera.cmon.tstore.leveldb.LDBTimeSeriesRollupManager: Starting rollup from ts_stream_rollup_PT86400S to rollup=WEEKLY for rollupTimestamp=2019-01-03T00:00:00.000Z
2019-01-03 08:04:04,404 INFO com.cloudera.cmon.tstore.leveldb.LDBTimeSeriesRollupManager: Finished rollup: duration=PT0.464S, numStreamsChecked=42228, numStreamsRolledUp=2061
2019-01-03 08:09:01,010 INFO com.cloudera.cmon.tstore.leveldb.LDBResourceManager: Closed: 0 partitions
2019-01-03 08:09:01,029 INFO com.cloudera.cmon.tstore.leveldb.LDBTimeSeriesRollupManager: Running the LDBTimeSeriesRollupManager at 2019-01-03T00:09:01.029Z, forMigratedData=false
2019-01-03 08:14:01,030 INFO com.cloudera.cmon.tstore.leveldb.LDBTimeSeriesRollupManager: Running the LDBTimeSeriesRollupManager at 2019-01-03T00:14:01.030Z, forMigratedData=false
2019-01-03 08:14:01,030 INFO com.cloudera.cmon.tstore.leveldb.LDBTimeSeriesRollupManager: Starting rollup from raw to rollup=TEN_MINUTELY for rollupTimestamp=2019-01-03T00:10:00.000Z
2019-01-03 08:14:01,961 INFO com.cloudera.cmon.tstore.leveldb.LDBTimeSeriesRollupManager: Finished rollup: duration=PT0.929S, numStreamsChecked=42228, numStreamsRolledUp=859
2019-01-03 08:19:01,014 INFO com.cloudera.cmon.tstore.leveldb.LDBResourceManager: Closed: 0 partitions
2019-01-03 08:19:01,030 INFO com.cloudera.cmon.tstore.leveldb.LDBTimeSeriesRollupManager: Running the LDBTimeSeriesRollupManager at 2019-01-03T00:19:01.030Z, forMigratedData=false
2019-01-03 08:24:01,030 INFO com.cloudera.cmon.tstore.leveldb.LDBTimeSeriesRollupManager: Running the LDBTimeSeriesRollupManager at 2019-01-03T00:24:01.030Z, forMigratedData=false
2019-01-03 08:24:01,031 INFO com.cloudera.cmon.tstore.leveldb.LDBTimeSeriesRollupManager: Starting rollup from raw to rollup=TEN_MINUTELY for rollupTimestamp=2019-01-03T00:20:00.000Z
2019-01-03 08:24:02,274 INFO com.cloudera.cmon.tstore.leveldb.LDBTimeSeriesRollupManager: Finished rollup: duration=PT1.243S, numStreamsChecked=42228, numStreamsRolledUp=859
2019-01-03 08:29:01,031 INFO com.cloudera.cmon.tstore.leveldb.LDBTimeSeriesRollupManager: Running the LDBTimeSeriesRollupManager at 2019-01-03T00:29:01.031Z, forMigratedData=false
2019-01-03 08:30:01,034 INFO com.cloudera.cmon.tstore.leveldb.LDBResourceManager: Closed: 0 partitions
The Service Monitor log:
2019-01-03 09:07:01,035 INFO org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x1680dc02bc4071d
2019-01-03 09:07:16,122 INFO com.cloudera.cmon.tstore.leveldb.LDBTimeSeriesRollupManager: Finished rollup: duration=PT167.412S, numStreamsChecked=6320720, numStreamsRolledUp=7678
2019-01-03 09:07:16,122 INFO com.cloudera.cmon.tstore.leveldb.LDBTimeSeriesRollupManager: Starting rollup from ts_stream_rollup_PT600S to rollup=HOURLY for rollupTimestamp=2019-01-03T01:00:00.000Z
2019-01-03 09:07:54,966 INFO com.cloudera.cmon.tstore.leveldb.LDBResourceManager: Closed: 0 partitions
2019-01-03 09:08:06,015 INFO org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Process identifier=hconnection-0x3ef8132e connecting to ZooKeeper ensemble=hadoop05.ddxq.idc:2181,hadoop03.ddxq.idc:2181,hadoop04.ddxq.idc:2181
2019-01-03 09:08:06,020 INFO org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Process identifier=ReplicationAdmin connecting to ZooKeeper ensemble=hadoop05.ddxq.idc:2181,hadoop03.ddxq.idc:2181,hadoop04.ddxq.idc:2181
2019-01-03 09:08:06,025 INFO org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x3680dc027f7070d
2019-01-03 09:08:43,137 INFO com.cloudera.cmon.tstore.leveldb.LDBTimeSeriesRollupManager: Finished rollup: duration=PT87.015S, numStreamsChecked=6320720, numStreamsRolledUp=7681
2019-01-03 09:09:06,021 INFO org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Process identifier=hconnection-0xf496593 connecting to ZooKeeper ensemble=hadoop05.ddxq.idc:2181,hadoop03.ddxq.idc:2181,hadoop04.ddxq.idc:2181
2019-01-03 09:09:06,025 INFO org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Process identifier=ReplicationAdmin connecting to ZooKeeper ensemble=hadoop05.ddxq.idc:2181,hadoop03.ddxq.idc:2181,hadoop04.ddxq.idc:2181
2019-01-03 09:09:06,029 INFO org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x3680dc027f7070f
2019-01-03 09:09:28,710 INFO com.cloudera.cmon.tstore.leveldb.LDBTimeSeriesRollupManager: Running the LDBTimeSeriesRollupManager at 2019-01-03T01:09:28.710Z, forMigratedData=false
2019-01-03 09:10:06,019 INFO org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Process identifier=hconnection-0x4cbadc22 connecting to ZooKeeper ensemble=hadoop05.ddxq.idc:2181,hadoop03.ddxq.idc:2181,hadoop04.ddxq.idc:2181
2019-01-03 09:10:06,022 INFO org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Process identifier=ReplicationAdmin connecting to ZooKeeper ensemble=hadoop05.ddxq.idc:2181,hadoop03.ddxq.idc:2181,hadoop04.ddxq.idc:2181
2019-01-03 09:10:06,026 INFO org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x1680dc02bc40724
2019-01-03 09:11:06,026 INFO org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Process identifier=hconnection-0x43eb1654 connecting to ZooKeeper ensemble=hadoop05.ddxq.idc:2181,hadoop03.ddxq.idc:2181,hadoop04.ddxq.idc:2181
2019-01-03 09:11:06,031 INFO org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Process identifier=ReplicationAdmin connecting to ZooKeeper ensemble=hadoop05.ddxq.idc:2181,hadoop03.ddxq.idc:2181,hadoop04.ddxq.idc:2181
2019-01-03 09:11:06,035 INFO org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x1680dc02bc40727
2019-01-03 09:11:26,101 INFO hive.metastore: Trying to connect to metastore with URI thrift://hadoop01.ddxq.idc:9083
2019-01-03 09:11:26,101 INFO hive.metastore: Opened a connection to metastore, current connections: 1
2019-01-03 09:11:26,103 INFO hive.metastore: Connected to metastore.
2019-01-03 09:11:26,812 INFO hive.metastore: Closed a connection to metastore, current connections: 0
2019-01-03 09:12:06,078 INFO org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Process identifier=hconnection-0x7b46b68a connecting to ZooKeeper ensemble=hadoop05.ddxq.idc:2181,hadoop03.ddxq.idc:2181,hadoop04.ddxq.idc:2181
2019-01-03 09:12:06,082 INFO org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Process identifier=ReplicationAdmin connecting to ZooKeeper ensemble=hadoop05.ddxq.idc:2181,hadoop03.ddxq.idc:2181,hadoop04.ddxq.idc:2181
2019-01-03 09:12:06,087 INFO org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x3680dc027f70713
2019-01-03 09:13:11,034 INFO org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Process identifier=hconnection-0x6359e1f1 connecting to ZooKeeper ensemble=hadoop05.ddxq.idc:2181,hadoop03.ddxq.idc:2181,hadoop04.ddxq.idc:2181
No error messages.
Created 01-03-2019 10:21 AM
I agree that it is possible (and even likely) that the issue is more on the agent side. What we really need to see is what the agent is doing at the time it is having trouble connecting. A great way to peek at the internals is to send a SIGQUIT signal to the agent, which will trigger it to dump thread stacks to the agent log. If you could run this a few times while CM is showing that the agent is out of contact with the Host Monitor, it might give us some clues as to whether the agent is under stress at that time.
Created 01-03-2019 10:23 AM
Oops... I clicked "POST" before telling you how to get the agent to dump the thread stacks to the agent log. You can run the following:
kill -SIGQUIT `cat /var/run/cloudera-scm-agent/cloudera-scm-agent.pid`
This will not cause the agent to restart, so it won't impact processing.
If you can run the kill -SIGQUIT a few times, that would give us an idea of how the threads are progressing.
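For anyone curious how that works under the hood, the general technique is a signal handler that walks every live thread's current frame and formats its stack. This is a generic sketch of the pattern, not the agent's actual implementation:

```python
import signal
import sys
import threading
import traceback

def dump_thread_stacks(signum=None, frame=None):
    """Format the current stack of every live thread in this process."""
    names = {t.ident: t.name for t in threading.enumerate()}
    lines = ["Dumping all Thread Stacks ..."]
    # sys._current_frames() maps thread id -> its topmost frame object
    for ident, frm in sys._current_frames().items():
        lines.append("# Thread: %s(%d)" % (names.get(ident, "?"), ident))
        lines.extend(entry.rstrip("\n") for entry in traceback.format_stack(frm))
    return "\n".join(lines)

# Install on SIGQUIT so `kill -SIGQUIT <pid>` logs the dump instead of
# taking SIGQUIT's default action (terminate with a core dump).
signal.signal(signal.SIGQUIT, lambda s, f: print(dump_thread_stacks(s, f)))

print(dump_thread_stacks())
```

Because the handler only reads frames and prints, the process keeps running normally afterward, which is why the SIGQUIT approach is safe on a live agent.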
Created 01-03-2019 07:03 PM
@bgooley I ran this command.
The log in /var/log/cloudera-scm-agent/cloudera-scm-agent.log then shows the following:
Dumping all Thread Stacks ...
# Thread: Monitor-GenericMonitor(140497188263680)
File: "/usr/lib64/python2.7/threading.py", line 784, in __bootstrap
self.__bootstrap_inner()
File: "/usr/lib64/python2.7/threading.py", line 811, in __bootstrap_inner
self.run()
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.1-py2.7.egg/cmf/monitor/wakeable_thread.py", line 34, in run
self._cv.wait(wait_time)
File: "/usr/lib64/python2.7/threading.py", line 339, in wait
waiter.acquire()
# Thread: Monitor-GenericMonitor(140496701748992)
File: "/usr/lib64/python2.7/threading.py", line 784, in __bootstrap
self.__bootstrap_inner()
File: "/usr/lib64/python2.7/threading.py", line 811, in __bootstrap_inner
self.run()
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.1-py2.7.egg/cmf/monitor/wakeable_thread.py", line 34, in run
self._cv.wait(wait_time)
File: "/usr/lib64/python2.7/threading.py", line 339, in wait
waiter.acquire()
# Thread: Monitor-GenericMonitor(140495628007168)
File: "/usr/lib64/python2.7/threading.py", line 784, in __bootstrap
self.__bootstrap_inner()
File: "/usr/lib64/python2.7/threading.py", line 811, in __bootstrap_inner
self.run()
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.1-py2.7.egg/cmf/monitor/wakeable_thread.py", line 34, in run
self._cv.wait(wait_time)
File: "/usr/lib64/python2.7/threading.py", line 339, in wait
waiter.acquire()
# Thread: Monitor-GenericMonitor(140496676570880)
File: "/usr/lib64/python2.7/threading.py", line 784, in __bootstrap
self.__bootstrap_inner()
File: "/usr/lib64/python2.7/threading.py", line 811, in __bootstrap_inner
self.run()
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.1-py2.7.egg/cmf/monitor/wakeable_thread.py", line 34, in run
self._cv.wait(wait_time)
File: "/usr/lib64/python2.7/threading.py", line 339, in wait
waiter.acquire()
# Thread: CP Server Thread-9(140497741920000)
File: "/usr/lib64/python2.7/threading.py", line 784, in __bootstrap
self.__bootstrap_inner()
File: "/usr/lib64/python2.7/threading.py", line 811, in __bootstrap_inner
self.run()
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/CherryPy-3.2.2-py2.7.egg/cherrypy/wsgiserver/wsgiserver2.py", line 1437, in run
conn = self.server.requests.get()
File: "/usr/lib64/python2.7/Queue.py", line 168, in get
self.not_empty.wait()
File: "/usr/lib64/python2.7/threading.py", line 339, in wait
waiter.acquire()
# Thread: Monitor-GenericMonitor(140496651392768)
File: "/usr/lib64/python2.7/threading.py", line 784, in __bootstrap
self.__bootstrap_inner()
File: "/usr/lib64/python2.7/threading.py", line 811, in __bootstrap_inner
self.run()
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.1-py2.7.egg/cmf/monitor/wakeable_thread.py", line 34, in run
self._cv.wait(wait_time)
File: "/usr/lib64/python2.7/threading.py", line 339, in wait
waiter.acquire()
# Thread: Monitor-GenericMonitor(140495091136256)
File: "/usr/lib64/python2.7/threading.py", line 784, in __bootstrap
self.__bootstrap_inner()
File: "/usr/lib64/python2.7/threading.py", line 811, in __bootstrap_inner
self.run()
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.1-py2.7.egg/cmf/monitor/wakeable_thread.py", line 34, in run
self._cv.wait(wait_time)
File: "/usr/lib64/python2.7/threading.py", line 339, in wait
waiter.acquire()
# Thread: MonitorDaemon-Reporter(140497213441792)
File: "/usr/lib64/python2.7/threading.py", line 784, in __bootstrap
self.__bootstrap_inner()
File: "/usr/lib64/python2.7/threading.py", line 811, in __bootstrap_inner
self.run()
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.1-py2.7.egg/cmf/monitor/wakeable_thread.py", line 50, in run
self._fn(*self._args, **self._kwargs)
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.1-py2.7.egg/cmf/monitor/__init__.py", line 163, in _report
self._report_for_monitors(monitors)
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.1-py2.7.egg/cmf/monitor/__init__.py", line 214, in _report_for_monitors
self.firehoses.send_smon_update(service_updates, role_updates, None)
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.1-py2.7.egg/cmf/monitor/firehoses.py", line 149, in send_smon_update
impala_query_updates=impala_query_updates)
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.1-py2.7.egg/cmf/monitor/firehoses.py", line 181, in _send_agent_message
firehose.send(dict(agent_msgs=[agentmsg]))
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.1-py2.7.egg/cmf/monitor/firehose.py", line 107, in send
self._send(messages)
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.1-py2.7.egg/cmf/monitor/firehose.py", line 124, in _send
self._requestor.request('sendAgentMessages', dict(messages=UNICODE_SANITIZER(messages)))
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/avro-1.6.3-py2.7.egg/avro/ipc.py", line 136, in request
self.write_call_request(message_name, request_datum, buffer_encoder)
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/avro-1.6.3-py2.7.egg/avro/ipc.py", line 178, in write_call_request
self.write_request(message.request, request_datum, encoder)
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/avro-1.6.3-py2.7.egg/avro/ipc.py", line 182, in write_request
datum_writer.write(request_datum, encoder)
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/avro-1.6.3-py2.7.egg/avro/io.py", line 770, in write
self.write_data(self.writers_schema, datum, encoder)
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/avro-1.6.3-py2.7.egg/avro/io.py", line 801, in write_data
self.write_record(writers_schema, datum, encoder)
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/avro-1.6.3-py2.7.egg/avro/io.py", line 889, in write_record
self.write_data(field.type, datum.get(field.name), encoder)
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/avro-1.6.3-py2.7.egg/avro/io.py", line 801, in write_data
self.write_record(writers_schema, datum, encoder)
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/avro-1.6.3-py2.7.egg/avro/io.py", line 889, in write_record
self.write_data(field.type, datum.get(field.name), encoder)
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/avro-1.6.3-py2.7.egg/avro/io.py", line 795, in write_data
self.write_array(writers_schema, datum, encoder)
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/avro-1.6.3-py2.7.egg/avro/io.py", line 839, in write_array
self.write_data(writers_schema.items, item, encoder)
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/avro-1.6.3-py2.7.egg/avro/io.py", line 801, in write_data
self.write_record(writers_schema, datum, encoder)
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/avro-1.6.3-py2.7.egg/avro/io.py", line 889, in write_record
self.write_data(field.type, datum.get(field.name), encoder)
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/avro-1.6.3-py2.7.egg/avro/io.py", line 795, in write_data
self.write_array(writers_schema, datum, encoder)
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/avro-1.6.3-py2.7.egg/avro/io.py", line 839, in write_array
self.write_data(writers_schema.items, item, encoder)
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/avro-1.6.3-py2.7.egg/avro/io.py", line 801, in write_data
self.write_record(writers_schema, datum, encoder)
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/avro-1.6.3-py2.7.egg/avro/io.py", line 889, in write_record
self.write_data(field.type, datum.get(field.name), encoder)
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/avro-1.6.3-py2.7.egg/avro/io.py", line 799, in write_data
self.write_union(writers_schema, datum, encoder)
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/avro-1.6.3-py2.7.egg/avro/io.py", line 879, in write_union
self.write_data(writers_schema.schemas[index_of_schema], datum, encoder)
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/avro-1.6.3-py2.7.egg/avro/io.py", line 795, in write_data
self.write_array(writers_schema, datum, encoder)
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/avro-1.6.3-py2.7.egg/avro/io.py", line 839, in write_array
self.write_data(writers_schema.items, item, encoder)
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/avro-1.6.3-py2.7.egg/avro/io.py", line 801, in write_data
self.write_record(writers_schema, datum, encoder)
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/avro-1.6.3-py2.7.egg/avro/io.py", line 889, in write_record
self.write_data(field.type, datum.get(field.name), encoder)
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/avro-1.6.3-py2.7.egg/avro/io.py", line 795, in write_data
self.write_array(writers_schema, datum, encoder)
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/avro-1.6.3-py2.7.egg/avro/io.py", line 839, in write_array
self.write_data(writers_schema.items, item, encoder)
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/avro-1.6.3-py2.7.egg/avro/io.py", line 804, in write_data
raise schema.AvroException(fail_msg)
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/avro-1.6.3-py2.7.egg/avro/io.py", line 889, in write_record
self.write_data(field.type, datum.get(field.name), encoder)
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/avro-1.6.3-py2.7.egg/avro/io.py", line 804, in write_data
raise schema.AvroException(fail_msg)
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/avro-1.6.3-py2.7.egg/avro/io.py", line 840, in write_array
encoder.write_long(0)
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/avro-1.6.3-py2.7.egg/avro/io.py", line 804, in write_data
raise schema.AvroException(fail_msg)
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/avro-1.6.3-py2.7.egg/avro/io.py", line 889, in write_record
self.write_data(field.type, datum.get(field.name), encoder)
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/avro-1.6.3-py2.7.egg/avro/io.py", line 804, in write_data
raise schema.AvroException(fail_msg)
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/avro-1.6.3-py2.7.egg/avro/io.py", line 879, in write_union
self.write_data(writers_schema.schemas[index_of_schema], datum, encoder)
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/avro-1.6.3-py2.7.egg/avro/io.py", line 804, in write_data
raise schema.AvroException(fail_msg)
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/avro-1.6.3-py2.7.egg/avro/io.py", line 303, in write_int
self.write_long(datum);
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/avro-1.6.3-py2.7.egg/avro/io.py", line 313, in write_long
self.write(chr(datum))
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/avro-1.6.3-py2.7.egg/avro/io.py", line 281, in write
self.writer.write(datum)
# Thread: HTTPServer Thread-2(140498278790912)
File: "/usr/lib64/python2.7/threading.py", line 784, in __bootstrap
self.__bootstrap_inner()
File: "/usr/lib64/python2.7/threading.py", line 811, in __bootstrap_inner
self.run()
File: "/usr/lib64/python2.7/threading.py", line 764, in run
self.__target(*self.__args, **self.__kwargs)
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/CherryPy-3.2.2-py2.7.egg/cherrypy/process/servers.py", line 187, in _start_http_thread
self.httpserver.start()
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/CherryPy-3.2.2-py2.7.egg/cherrypy/wsgiserver/wsgiserver2.py", line 1838, in start
self.tick()
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/CherryPy-3.2.2-py2.7.egg/cherrypy/wsgiserver/wsgiserver2.py", line 1950, in tick
return
File: "/usr/lib64/python2.7/socket.py", line 202, in accept
sock, addr = self._sock.accept()
# Thread: Monitor-GenericMonitor(140495577650944)
File: "/usr/lib64/python2.7/threading.py", line 784, in __bootstrap
self.__bootstrap_inner()
File: "/usr/lib64/python2.7/threading.py", line 811, in __bootstrap_inner
self.run()
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.1-py2.7.egg/cmf/monitor/wakeable_thread.py", line 34, in run
self._cv.wait(wait_time)
File: "/usr/lib64/python2.7/threading.py", line 339, in wait
waiter.acquire()
# Thread: Monitor-GenericMonitor(140495586043648)
File: "/usr/lib64/python2.7/threading.py", line 784, in __bootstrap
self.__bootstrap_inner()
File: "/usr/lib64/python2.7/threading.py", line 811, in __bootstrap_inner
self.run()
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.1-py2.7.egg/cmf/monitor/wakeable_thread.py", line 34, in run
self._cv.wait(wait_time)
File: "/usr/lib64/python2.7/threading.py", line 339, in wait
waiter.acquire()
# Thread: Monitor-GenericMonitor(140495594436352)
File: "/usr/lib64/python2.7/threading.py", line 784, in __bootstrap
self.__bootstrap_inner()
File: "/usr/lib64/python2.7/threading.py", line 811, in __bootstrap_inner
self.run()
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.1-py2.7.egg/cmf/monitor/wakeable_thread.py", line 34, in run
self._cv.wait(wait_time)
File: "/usr/lib64/python2.7/threading.py", line 339, in wait
waiter.acquire()
# Thread: DnsResolutionMonitor(140497221834496)
File: "/usr/lib64/python2.7/threading.py", line 784, in __bootstrap
self.__bootstrap_inner()
File: "/usr/lib64/python2.7/threading.py", line 811, in __bootstrap_inner
self.run()
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.1-py2.7.egg/cmf/monitor/stoppable_thread.py", line 34, in run
time.sleep(sleep)
# Thread: Monitor-GenericMonitor(140495602829056)
File: "/usr/lib64/python2.7/threading.py", line 784, in __bootstrap
self.__bootstrap_inner()
File: "/usr/lib64/python2.7/threading.py", line 811, in __bootstrap_inner
self.run()
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.1-py2.7.egg/cmf/monitor/wakeable_thread.py", line 34, in run
self._cv.wait(wait_time)
File: "/usr/lib64/python2.7/threading.py", line 339, in wait
waiter.acquire()
# Thread: Monitor-GenericMonitor(140496148092672)
File: "/usr/lib64/python2.7/threading.py", line 784, in __bootstrap
self.__bootstrap_inner()
File: "/usr/lib64/python2.7/threading.py", line 811, in __bootstrap_inner
self.run()
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.1-py2.7.egg/cmf/monitor/wakeable_thread.py", line 34, in run
self._cv.wait(wait_time)
File: "/usr/lib64/python2.7/threading.py", line 339, in wait
waiter.acquire()
# Thread: Monitor-GenericMonitor(140496156485376)
File: "/usr/lib64/python2.7/threading.py", line 784, in __bootstrap
self.__bootstrap_inner()
File: "/usr/lib64/python2.7/threading.py", line 811, in __bootstrap_inner
self.run()
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.1-py2.7.egg/cmf/monitor/wakeable_thread.py", line 34, in run
self._cv.wait(wait_time)
File: "/usr/lib64/python2.7/threading.py", line 339, in wait
waiter.acquire()
# Thread: Monitor-HostMonitor(140497230227200)
File: "/usr/lib64/python2.7/threading.py", line 784, in __bootstrap
self.__bootstrap_inner()
File: "/usr/lib64/python2.7/threading.py", line 811, in __bootstrap_inner
self.run()
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.1-py2.7.egg/cmf/monitor/wakeable_thread.py", line 34, in run
self._cv.wait(wait_time)
File: "/usr/lib64/python2.7/threading.py", line 339, in wait
waiter.acquire()
# Thread: MainThread(140498665850688)
File: "/usr/lib64/cmf/agent/build/env/bin/cmf-agent", line 12, in <module>
load_entry_point('cmf==5.16.1', 'console_scripts', 'cmf-agent')()
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.1-py2.7.egg/cmf/agent.py", line 3127, in main
main_impl()
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.1-py2.7.egg/cmf/agent.py", line 3110, in main_impl
agent.start()
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.1-py2.7.egg/cmf/agent.py", line 852, in start
self.__issue_heartbeat()
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.1-py2.7.egg/cmf/agent.py", line 754, in __issue_heartbeat
heartbeat_response = self.send_heartbeat(heartbeat)
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.1-py2.7.egg/cmf/agent.py", line 1401, in send_heartbeat
response = self._send_heartbeat(heartbeat)
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.1-py2.7.egg/cmf/agent.py", line 1442, in _send_heartbeat
response = self.requestor.request('heartbeat', heartbeat_data)
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/avro-1.6.3-py2.7.egg/avro/ipc.py", line 141, in request
return self.issue_request(call_request, message_name, request_datum)
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/avro-1.6.3-py2.7.egg/avro/ipc.py", line 254, in issue_request
call_response = self.transceiver.transceive(call_request)
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/avro-1.6.3-py2.7.egg/avro/ipc.py", line 483, in transceive
result = self.read_framed_message()
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/avro-1.6.3-py2.7.egg/avro/ipc.py", line 487, in read_framed_message
response = self.conn.getresponse()
File: "/usr/lib64/python2.7/httplib.py", line 1089, in getresponse
response.begin()
File: "/usr/lib64/python2.7/httplib.py", line 476, in begin
self.msg = HTTPMessage(self.fp, 0)
File: "/usr/lib64/python2.7/mimetools.py", line 25, in __init__
rfc822.Message.__init__(self, fp, seekable)
File: "/usr/lib64/python2.7/rfc822.py", line 108, in __init__
self.readheaders()
File: "/usr/lib64/python2.7/httplib.py", line 315, in readheaders
line = self.fp.readline(_MAXLINE + 1)
File: "/usr/lib64/python2.7/socket.py", line 476, in readline
data = self._sock.recv(self._rbufsize)
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.1-py2.7.egg/cmf/util/__init__.py", line 193, in dumpstacks
for filename, lineno, name, line in traceback.extract_stack(stack):
# Thread: Monitor-GenericMonitor(140496131307264)
File: "/usr/lib64/python2.7/threading.py", line 784, in __bootstrap
self.__bootstrap_inner()
File: "/usr/lib64/python2.7/threading.py", line 811, in __bootstrap_inner
self.run()
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.1-py2.7.egg/cmf/monitor/wakeable_thread.py", line 34, in run
self._cv.wait(wait_time)
File: "/usr/lib64/python2.7/threading.py", line 339, in wait
waiter.acquire()
# Thread: _TimeoutMonitor(140498287183616)
File: "/usr/lib64/python2.7/threading.py", line 784, in __bootstrap
self.__bootstrap_inner()
File: "/usr/lib64/python2.7/threading.py", line 811, in __bootstrap_inner
self.run()
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/CherryPy-3.2.2-py2.7.egg/cherrypy/process/plugins.py", line 471, in run
time.sleep(self.interval)
# Thread: Monitor-GenericMonitor(140495611221760)
File: "/usr/lib64/python2.7/threading.py", line 784, in __bootstrap
self.__bootstrap_inner()
File: "/usr/lib64/python2.7/threading.py", line 811, in __bootstrap_inner
self.run()
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.1-py2.7.egg/cmf/monitor/wakeable_thread.py", line 34, in run
self._cv.wait(wait_time)
File: "/usr/lib64/python2.7/threading.py", line 339, in wait
waiter.acquire()
# Thread: Monitor-GenericMonitor(140496114521856)
File: "/usr/lib64/python2.7/threading.py", line 784, in __bootstrap
self.__bootstrap_inner()
File: "/usr/lib64/python2.7/threading.py", line 811, in __bootstrap_inner
self.run()
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.1-py2.7.egg/cmf/monitor/wakeable_thread.py", line 34, in run
self._cv.wait(wait_time)
File: "/usr/lib64/python2.7/threading.py", line 339, in wait
waiter.acquire()
# Thread: Monitor-GenericMonitor(140495619614464)
File: "/usr/lib64/python2.7/threading.py", line 784, in __bootstrap
self.__bootstrap_inner()
File: "/usr/lib64/python2.7/threading.py", line 811, in __bootstrap_inner
self.run()
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.1-py2.7.egg/cmf/monitor/wakeable_thread.py", line 34, in run
self._cv.wait(wait_time)
File: "/usr/lib64/python2.7/threading.py", line 339, in wait
waiter.acquire()
# Thread: Monitor-GenericMonitor(140496693356288)
File: "/usr/lib64/python2.7/threading.py", line 784, in __bootstrap
self.__bootstrap_inner()
File: "/usr/lib64/python2.7/threading.py", line 811, in __bootstrap_inner
self.run()
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.1-py2.7.egg/cmf/monitor/wakeable_thread.py", line 34, in run
self._cv.wait(wait_time)
File: "/usr/lib64/python2.7/threading.py", line 339, in wait
waiter.acquire()
# Thread: CP Server Thread-7(140497758705408)
File: "/usr/lib64/python2.7/threading.py", line 784, in __bootstrap
self.__bootstrap_inner()
File: "/usr/lib64/python2.7/threading.py", line 811, in __bootstrap_inner
self.run()
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/CherryPy-3.2.2-py2.7.egg/cherrypy/wsgiserver/wsgiserver2.py", line 1437, in run
conn = self.server.requests.get()
File: "/usr/lib64/python2.7/Queue.py", line 168, in get
self.not_empty.wait()
File: "/usr/lib64/python2.7/threading.py", line 339, in wait
waiter.acquire()
# Thread: Monitor-GenericMonitor(140495065958144)
File: "/usr/lib64/python2.7/threading.py", line 784, in __bootstrap
self.__bootstrap_inner()
File: "/usr/lib64/python2.7/threading.py", line 811, in __bootstrap_inner
self.run()
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.1-py2.7.egg/cmf/monitor/wakeable_thread.py", line 34, in run
self._cv.wait(wait_time)
File: "/usr/lib64/python2.7/threading.py", line 339, in wait
waiter.acquire()
# Thread: CredentialManager(140498388784896)
File: "/usr/lib64/python2.7/threading.py", line 784, in __bootstrap
self.__bootstrap_inner()
File: "/usr/lib64/python2.7/threading.py", line 811, in __bootstrap_inner
self.run()
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.1-py2.7.egg/cmf/kt_renewer.py", line 181, in run
self._trigger.wait(_RENEWAL_PERIOD)
File: "/usr/lib64/python2.7/threading.py", line 361, in wait
_sleep(delay)
# Thread: Monitor-GenericMonitor(140496164878080)
File: "/usr/lib64/python2.7/threading.py", line 784, in __bootstrap
self.__bootstrap_inner()
File: "/usr/lib64/python2.7/threading.py", line 811, in __bootstrap_inner
self.run()
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.1-py2.7.egg/cmf/monitor/wakeable_thread.py", line 34, in run
self._cv.wait(wait_time)
File: "/usr/lib64/python2.7/threading.py", line 339, in wait
waiter.acquire()
# Thread: Monitor-GenericMonitor(140496659785472)
File: "/usr/lib64/python2.7/threading.py", line 784, in __bootstrap
self.__bootstrap_inner()
File: "/usr/lib64/python2.7/threading.py", line 811, in __bootstrap_inner
self.run()
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.1-py2.7.egg/cmf/monitor/wakeable_thread.py", line 34, in run
self._cv.wait(wait_time)
File: "/usr/lib64/python2.7/threading.py", line 339, in wait
waiter.acquire()
# Thread: Monitor-GenericMonitor(140496139699968)
File: "/usr/lib64/python2.7/threading.py", line 784, in __bootstrap
self.__bootstrap_inner()
File: "/usr/lib64/python2.7/threading.py", line 811, in __bootstrap_inner
self.run()
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.1-py2.7.egg/cmf/monitor/wakeable_thread.py", line 34, in run
self._cv.wait(wait_time)
File: "/usr/lib64/python2.7/threading.py", line 339, in wait
waiter.acquire()
# Thread: Metadata-Plugin(140498303969024)
File: "/usr/lib64/python2.7/threading.py", line 784, in __bootstrap
self.__bootstrap_inner()
File: "/usr/lib64/python2.7/threading.py", line 811, in __bootstrap_inner
self.run()
File: "/usr/lib64/python2.7/threading.py", line 764, in run
self.__target(*self.__args, **self.__kwargs)
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.1-py2.7.egg/cmf/util/__init__.py", line 489, in wrapper
return fn(self, *args, **kwargs)
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.1-py2.7.egg/cmf/audit/navigator_thread.py", line 168, in _monitor_logs
time.sleep(event_poll_interval)
# Thread: CP Server Thread-11(140497725134592)
File: "/usr/lib64/python2.7/threading.py", line 784, in __bootstrap
self.__bootstrap_inner()
File: "/usr/lib64/python2.7/threading.py", line 811, in __bootstrap_inner
self.run()
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/CherryPy-3.2.2-py2.7.egg/cherrypy/wsgiserver/wsgiserver2.py", line 1437, in run
conn = self.server.requests.get()
File: "/usr/lib64/python2.7/Queue.py", line 168, in get
self.not_empty.wait()
File: "/usr/lib64/python2.7/threading.py", line 339, in wait
waiter.acquire()
# Thread: Profile-Plugin(140498295576320)
File: "/usr/lib64/python2.7/threading.py", line 784, in __bootstrap
self.__bootstrap_inner()
File: "/usr/lib64/python2.7/threading.py", line 811, in __bootstrap_inner
self.run()
File: "/usr/lib64/python2.7/threading.py", line 764, in run
self.__target(*self.__args, **self.__kwargs)
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.1-py2.7.egg/cmf/util/__init__.py", line 489, in wrapper
return fn(self, *args, **kwargs)
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.1-py2.7.egg/cmf/audit/navigator_thread.py", line 168, in _monitor_logs
time.sleep(event_poll_interval)
# Thread: Monitor-GenericMonitor(140496684963584)
File: "/usr/lib64/python2.7/threading.py", line 784, in __bootstrap
self.__bootstrap_inner()
File: "/usr/lib64/python2.7/threading.py", line 811, in __bootstrap_inner
self.run()
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.1-py2.7.egg/cmf/monitor/wakeable_thread.py", line 34, in run
self._cv.wait(wait_time)
File: "/usr/lib64/python2.7/threading.py", line 339, in wait
waiter.acquire()
# Thread: Audit-Plugin(140498312361728)
File: "/usr/lib64/python2.7/threading.py", line 784, in __bootstrap
self.__bootstrap_inner()
File: "/usr/lib64/python2.7/threading.py", line 811, in __bootstrap_inner
self.run()
File: "/usr/lib64/python2.7/threading.py", line 764, in run
self.__target(*self.__args, **self.__kwargs)
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.1-py2.7.egg/cmf/util/__init__.py", line 489, in wrapper
return fn(self, *args, **kwargs)
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.1-py2.7.egg/cmf/audit/navigator_thread.py", line 168, in _monitor_logs
time.sleep(event_poll_interval)
# Thread: Thread-13(140497196656384)
File: "/usr/lib64/python2.7/threading.py", line 784, in __bootstrap
self.__bootstrap_inner()
File: "/usr/lib64/python2.7/threading.py", line 811, in __bootstrap_inner
self.run()
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/threadpool-1.2.7-py2.7.egg/threadpool.py", line 147, in run
request = self._requests_queue.get(True, self._poll_timeout)
File: "/usr/lib64/python2.7/Queue.py", line 177, in get
self.not_empty.wait(remaining)
File: "/usr/lib64/python2.7/threading.py", line 361, in wait
_sleep(delay)
# Thread: CP Server Thread-6(140497767098112)
File: "/usr/lib64/python2.7/threading.py", line 784, in __bootstrap
self.__bootstrap_inner()
File: "/usr/lib64/python2.7/threading.py", line 811, in __bootstrap_inner
self.run()
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/CherryPy-3.2.2-py2.7.egg/cherrypy/wsgiserver/wsgiserver2.py", line 1437, in run
conn = self.server.requests.get()
File: "/usr/lib64/python2.7/Queue.py", line 168, in get
self.not_empty.wait()
File: "/usr/lib64/python2.7/threading.py", line 339, in wait
waiter.acquire()
# Thread: CP Server Thread-4(140498262005504)
File: "/usr/lib64/python2.7/threading.py", line 784, in __bootstrap
self.__bootstrap_inner()
File: "/usr/lib64/python2.7/threading.py", line 811, in __bootstrap_inner
self.run()
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/CherryPy-3.2.2-py2.7.egg/cherrypy/wsgiserver/wsgiserver2.py", line 1437, in run
conn = self.server.requests.get()
File: "/usr/lib64/python2.7/Queue.py", line 168, in get
self.not_empty.wait()
File: "/usr/lib64/python2.7/threading.py", line 339, in wait
waiter.acquire()
# Thread: Monitor-GenericMonitor(140495057565440)
File: "/usr/lib64/python2.7/threading.py", line 784, in __bootstrap
self.__bootstrap_inner()
File: "/usr/lib64/python2.7/threading.py", line 811, in __bootstrap_inner
self.run()
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.1-py2.7.egg/cmf/monitor/wakeable_thread.py", line 34, in run
self._cv.wait(wait_time)
File: "/usr/lib64/python2.7/threading.py", line 339, in wait
waiter.acquire()
# Thread: CP Server Thread-8(140497750312704)
File: "/usr/lib64/python2.7/threading.py", line 784, in __bootstrap
self.__bootstrap_inner()
File: "/usr/lib64/python2.7/threading.py", line 811, in __bootstrap_inner
self.run()
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/CherryPy-3.2.2-py2.7.egg/cherrypy/wsgiserver/wsgiserver2.py", line 1437, in run
conn = self.server.requests.get()
File: "/usr/lib64/python2.7/Queue.py", line 168, in get
self.not_empty.wait()
File: "/usr/lib64/python2.7/threading.py", line 339, in wait
waiter.acquire()
# Thread: Monitor-GenericMonitor(140496668178176)
File: "/usr/lib64/python2.7/threading.py", line 784, in __bootstrap
self.__bootstrap_inner()
File: "/usr/lib64/python2.7/threading.py", line 811, in __bootstrap_inner
self.run()
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.1-py2.7.egg/cmf/monitor/wakeable_thread.py", line 34, in run
self._cv.wait(wait_time)
File: "/usr/lib64/python2.7/threading.py", line 339, in wait
waiter.acquire()
# Thread: CP Server Thread-12(140497238619904)
File: "/usr/lib64/python2.7/threading.py", line 784, in __bootstrap
self.__bootstrap_inner()
File: "/usr/lib64/python2.7/threading.py", line 811, in __bootstrap_inner
self.run()
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/CherryPy-3.2.2-py2.7.egg/cherrypy/wsgiserver/wsgiserver2.py", line 1437, in run
conn = self.server.requests.get()
File: "/usr/lib64/python2.7/Queue.py", line 168, in get
self.not_empty.wait()
File: "/usr/lib64/python2.7/threading.py", line 339, in wait
waiter.acquire()
# Thread: CP Server Thread-10(140497733527296)
File: "/usr/lib64/python2.7/threading.py", line 784, in __bootstrap
self.__bootstrap_inner()
File: "/usr/lib64/python2.7/threading.py", line 811, in __bootstrap_inner
self.run()
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/CherryPy-3.2.2-py2.7.egg/cherrypy/wsgiserver/wsgiserver2.py", line 1437, in run
conn = self.server.requests.get()
File: "/usr/lib64/python2.7/Queue.py", line 168, in get
self.not_empty.wait()
File: "/usr/lib64/python2.7/threading.py", line 339, in wait
waiter.acquire()
self.__bootstrap_inner()
File: "/usr/lib64/python2.7/threading.py", line 811, in __bootstrap_inner
self.run()
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.1-py2.7.egg/cmf/monitor/wakeable_thread.py", line 34, in run
self._cv.wait(wait_time)
File: "/usr/lib64/python2.7/threading.py", line 339, in wait
waiter.acquire()
File: "/usr/lib64/python2.7/threading.py", line 784, in __bootstrap
self.run()
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.1-py2.7.egg/cmf/monitor/wakeable_thread.py", line 34, in run
self._cv.wait(wait_time)
# Thread: ImpalaDaemonQueryMonitoring(140496122914560)
File: "/usr/lib64/python2.7/threading.py", line 784, in __bootstrap
self.run()
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.1-py2.7.egg/cmf/monitor/impalad/query_monitor.py", line 871, in _check_for_queries
completed_query_profiles))
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.1-py2.7.egg/cmf/monitor/impalad/query_monitor.py", line 909, in _get_completed_query_profiles
return completed_query_ids, completed_query_profiles, True
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.1-py2.7.egg/cmf/monitor/impalad/query_monitor.py", line 601, in get_completed_queries
return completed_queries
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.1-py2.7.egg/cmf/monitor/impalad/query_monitor.py", line 484, in get_completed_queries
return next_start_datetime, next_last_file_timestamp, completed_query_profiles
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.1-py2.7.egg/cmf/monitor/impalad/query_monitor.py", line 313, in _get_completed_queries
return next_start_datetime, latest_file_timestamp, completed_queries
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/ClusterStatsLogStreaming-UNKNOWN-py2.7.egg/clusterstats/log/streaming/event_streamer.py", line 114, in __init__
self.__filtered_file_list = self.__apply_file_filter()
return filter_context["file_list"]
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/ClusterStatsCommon-0.1-py2.7.egg/clusterstats/common/chain.py", line 25, in __call__
return True
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.1-py2.7.egg/cmf/monitor/impalad/query_monitor.py", line 103, in __call__
return True
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.1-py2.7.egg/cmf/monitor/impalad/query_monitor.py", line 108, in __set_start_offset
f.set_start_offset(event.get_offset())
return event
return event
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/ClusterStatsLogStreaming-UNKNOWN-py2.7.egg/clusterstats/log/streaming/event_reader.py", line 84, in get_prev_event
return event
return event, ''.join(prior_data)
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/ClusterStatsLogStreaming-UNKNOWN-py2.7.egg/clusterstats/log/streaming/file_line_reader.py", line 86, in get_next_line
return line
return data, block_offset
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/ClusterStatsLogStreaming-UNKNOWN-py2.7.egg/clusterstats/log/streaming/file_line_reader.py", line 203, in __read_data_till_next_newline
return line
# Thread: MonitorDaemon-Scheduler(140497205049088)
File: "/usr/lib64/python2.7/threading.py", line 784, in __bootstrap
self.__bootstrap_inner()
File: "/usr/lib64/python2.7/threading.py", line 811, in __bootstrap_inner
self.run()
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.1-py2.7.egg/cmf/monitor/wakeable_thread.py", line 34, in run
self._cv.wait(wait_time)
File: "/usr/lib64/python2.7/threading.py", line 361, in wait
_sleep(delay)
# Thread: CP Server Thread-5(140497775490816)
File: "/usr/lib64/python2.7/threading.py", line 784, in __bootstrap
self.__bootstrap_inner()
File: "/usr/lib64/python2.7/threading.py", line 811, in __bootstrap_inner
self.run()
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/CherryPy-3.2.2-py2.7.egg/cherrypy/wsgiserver/wsgiserver2.py", line 1437, in run
conn = self.server.requests.get()
File: "/usr/lib64/python2.7/Queue.py", line 168, in get
self.not_empty.wait()
File: "/usr/lib64/python2.7/threading.py", line 339, in wait
waiter.acquire()
# Thread: CP Server Thread-3(140498270398208)
File: "/usr/lib64/python2.7/threading.py", line 784, in __bootstrap
self.__bootstrap_inner()
File: "/usr/lib64/python2.7/threading.py", line 811, in __bootstrap_inner
self.run()
File: "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/CherryPy-3.2.2-py2.7.egg/cherrypy/wsgiserver/wsgiserver2.py", line 1437, in run
conn = self.server.requests.get()
File: "/usr/lib64/python2.7/Queue.py", line 168, in get
self.not_empty.wait()
File: "/usr/lib64/python2.7/threading.py", line 339, in wait
waiter.acquire()
Created 11-25-2019 04:28 AM
We are facing exactly the same issue.
Could you please help us understand how you managed to resolve it?
thanks,
Pratik
Created 11-25-2019 09:03 AM
Hi @AstroPratik ,
First, so that we can provide the best help, we need accurate information about the issue you are observing. My guess is that you are seeing the same health alert in Cloudera Manager, but we also need to confirm that you are seeing the same messages in the agent log.
If so, please follow the instructions to provide a thread dump via the SIGQUIT signal. The instructions I provided for the "kill -SIGQUIT" command only work in Cloudera Manager 5.x. If you are using CM 6, you can use the following:
kill -SIGQUIT $(systemctl show -p MainPID cloudera-scm-agent.service 2>/dev/null | cut -d= -f2)
If you do run the kill -SIGQUIT, make sure to run it a couple of times so we can compare snapshots, and make sure you capture the thread dumps while the problem is occurring.
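To take several snapshots a fixed interval apart, the signal can also be sent from a small script. A minimal Python sketch — the function name and defaults here are illustrative, not a Cloudera-provided tool:

```python
import os
import signal
import time

def capture_thread_dumps(pid, count=3, interval=30):
    """Send SIGQUIT to `pid` `count` times, `interval` seconds apart.

    Each signal makes the cmf-agent process append a full thread dump
    to its log, giving snapshots that can be compared side by side.
    """
    for i in range(count):
        os.kill(pid, signal.SIGQUIT)
        if i < count - 1:
            time.sleep(interval)
```

Run it against the agent PID (for example, the MainPID reported by systemctl on CM 6) while the health alert is active, so the dumps reflect the problem state.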
NOTE: After reviewing the previous party's thread dump, it appears that a thread spawned to collect information for a diagnostic bundle is processing slowly; the thread that uploads service and host information to the Host Monitor and Service Monitor servers also appears slow.
Since creating a diagnostic bundle is something that does not happen often, it is likely that the bundle creation is what triggered the slowness that was observed.
There are a number of possible causes for "firehose" trouble, though, so it is important that we understand the facts about your situation before making any judgements.
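As an illustration of how a dump like the one pasted above can be triaged, a short parsing sketch (the function name is mine, not a Cloudera tool) maps each `# Thread:` section to its innermost source line, so the one or two busy threads stand out from the many idle waiters:

```python
import re

def summarize_thread_dump(dump_text):
    """Map each thread name from a '# Thread: Name(id)' header to the
    innermost (last listed) source line in that thread's stack."""
    threads = {}
    name = None
    for raw in dump_text.splitlines():
        line = raw.strip()
        m = re.match(r'# Thread: (.+?)\(\d+\)$', line)
        if m:
            name = m.group(1)
            threads[name] = None
        elif name and not line.startswith('File'):
            threads[name] = line  # the last source line seen wins
    return threads
```

Applied to the dump above, nearly every Monitor and CherryPy thread bottoms out in `waiter.acquire()` or a sleep (idle), while MainThread is blocked in `self._sock.recv(...)` waiting for the heartbeat HTTP response — which is the kind of pattern worth comparing across snapshots.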