Ambari Metric Collector: Error sending metric to server. timed out

I had a disk-full issue. After freeing some space in /var and then trying to restart the Metrics Collector from Ambari, I got this error:

----------error ---------------

2015-11-24 11:35:54,281 [INFO] controller.py:110 - Adding event to cache,  : {u'metrics': [], u'collect_every': u'15'}
2015-11-24 11:35:54,281 [INFO] main.py:65 - Starting Server RPC Thread: /usr/lib/python2.6/site-packages/resource_monitoring/main.py start
2015-11-24 11:35:54,281 [INFO] controller.py:57 - Running Controller thread: Thread-1
2015-11-24 11:35:54,282 [INFO] emitter.py:45 - Running Emitter thread: Thread-2
2015-11-24 11:35:54,282 [INFO] emitter.py:65 - Nothing to emit, resume waiting.
2015-11-24 11:36:54,283 [INFO] emitter.py:91 - server: http://xxxxxxx.com:6188/ws/v1/timeline/metrics
2015-11-24 11:37:44,334 [WARNING] emitter.py:74 - Error sending metrics to server. timed out
2015-11-24 11:37:44,334 [WARNING] emitter.py:80 - Retrying after 5 ...

Accepted Solution

Re: Ambari Metric Collector: Error sending metric to server. timed out

I just had this issue and this is how it was solved.

I added this property to ams-hbase-site: hbase.zookeeper.property.tickTime = 6000, and then restarted AMS.
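
For anyone who wants to confirm that the setting actually landed in the generated config after the restart, here is a minimal Python sketch that reads a property out of Hadoop-style XML. The helper name `get_property` and the sample XML are assumptions for illustration, not something from the answer above; the change itself should still be made through Ambari, which manages this file.

```python
import xml.etree.ElementTree as ET


def get_property(xml_text, name):
    """Return the value of a named property from Hadoop-style
    <configuration><property><name/><value/></property></configuration>
    XML, or None if the property is not present."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == name:
            value = prop.findtext("value")
            return value.strip() if value is not None else None
    return None


# Hypothetical sample mirroring the setting from the accepted answer.
SAMPLE = """
<configuration>
  <property>
    <name>hbase.zookeeper.property.tickTime</name>
    <value>6000</value>
  </property>
</configuration>
"""

if __name__ == "__main__":
    print(get_property(SAMPLE, "hbase.zookeeper.property.tickTime"))
```

To check a real node, pass in the contents of the AMS HBase config file. It is often found at /etc/ams-hbase/conf/hbase-site.xml, but that location is an assumption, so verify it on your cluster.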

Re: Ambari Metric Collector: Error sending metric to server. timed out

@Mike Li

netstat -anp | grep 6188

What's the output of the above command? Can you post more data from the log?
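
If netstat is not available, or you want to test reachability from a different host such as one running a Metrics Monitor, a rough equivalent is to attempt a TCP connection to the collector port. This is only a sketch: `port_is_open` is a hypothetical helper name, and "localhost" is a placeholder for the Metrics Collector host.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    # "localhost" is a placeholder; use the Metrics Collector host instead.
    print(port_is_open("localhost", 6188))
```

A result of False lines up with the "timed out" and "Connection refused" messages in the monitor log: nothing is accepting connections on that port.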

Re: Ambari Metric Collector: Error sending metric to server. timed out

Nothing is returned from netstat -anp | grep 6188 when run on the namenode. On the Ambari server node (which is also our edge node), the output of the command is:

[ambari@xxxxx~]$ netstat -anp |grep 6188
(Not all processes could be identified, non-owned process info
 will not be shown, you would have to be root to see it all.)
unix  3  [ ]  STREAM  CONNECTED  215461888 -  /var/lib/sss/pipes/private/sbus-dp_centene.com.39654
unix  3  [ ]  STREAM  CONNECTED  215461883 -  /var/lib/sss/pipes/private/sbus-dp_centene.com.39654

Thank you, Neeraj, for your quick response.
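
Note that both lines returned above are Unix domain sockets that matched only because their inode numbers (215461888 and 215461883) contain the digits 6188, so as far as this output shows, nothing is listening on TCP port 6188 on this node either. A minimal Python sketch of a stricter filter over netstat output follows. The function name `tcp_lines_for_port` is hypothetical, and the second sample line is made up to show what a listening collector might look like.

```python
def tcp_lines_for_port(netstat_output, port):
    """Return the TCP lines of `netstat -an`-style output whose local
    address ends in :<port>, ignoring matches elsewhere in the line."""
    suffix = ":%d" % port
    matches = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # TCP rows: proto recv-q send-q local-addr foreign-addr state ...
        if len(fields) >= 4 and fields[0].startswith("tcp"):
            if fields[3].endswith(suffix):
                matches.append(line)
    return matches


# First line copied from the output above; the second is a made-up
# example of a process listening on the collector port.
SAMPLE = """\
unix  3  [ ]  STREAM  CONNECTED  215461888 -  /var/lib/sss/pipes/private/sbus-dp_centene.com.39654
tcp        0      0 0.0.0.0:6188      0.0.0.0:*      LISTEN      12345/java
"""

if __name__ == "__main__":
    for line in tcp_lines_for_port(SAMPLE, 6188):
        print(line)
```

Matching on the local-address column rather than the whole line avoids false positives from inode numbers, PIDs, or socket paths.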

Re: Ambari Metric Collector: Error sending metric to server. timed out

@Mike Li

Try restarting the AMS service and run tail -f on the AMS logs to check the exact messages while it's crashing.

Re: Ambari Metric Collector: Error sending metric to server. timed out

To Neeraj's point, make sure the Metrics Collector process is up and listening on port 6188. Once it is up and running, the Metrics Monitors should reconnect and start sending metrics.

Re: Ambari Metric Collector: Error sending metric to server. timed out

When trying to restart AMS from the Ambari UI, I checked the AMS log file ambari-metrics-monitor.out on the Ambari server host node:
-----------------------------------------
2015-11-24 13:47:30,987 [WARNING] emitter.py:80 - Retrying after 5 ...
2015-11-24 13:48:35,989 [INFO] emitter.py:91 - server: http://xxxx06t.xxxx.com:6188/ws/v1/timeline/metrics
2015-11-24 13:48:35,990 [WARNING] emitter.py:74 - Error sending metrics to server. <urlopen error [Errno 111] Connection refused>
On another node, which is the primary namenode and also one of the ZooKeeper nodes, the AMS log file shows:
------------------------------------------------------------------------------------------------------------
2015-11-24 13:43:50,822 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:61181. Will not attempt to authenticate using SASL (unknown error)
2015-11-24 13:43:50,822 WARN org.apache.zookeeper.ClientCnxn: Session 0x1513af962d30005 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
  at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
  at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:744)
  at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
  at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
2015-11-24 13:43:51,217 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:61181. Will not attempt to authenticate using SASL (unknown error)
2015-11-24 13:43:51,217 WARN org.apache.zookeeper.ClientCnxn: Session 0x1513af962d30005 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
  at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
  at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:744)
  at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
  at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
The hbase-ams-master-xxx06t.xxxx.com.log has the same error message:
--------------------------------------------------------------------------------------------
2015-11-24 13:30:10,202 WARN  [main-SendThread(localhost:61181)] zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
  at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
  at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:744)
  at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
  at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
2015-11-24 13:30:10,206 WARN  [main] zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper, quorum=localhost:61181, exception=org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/master

Re: Ambari Metric Collector: Error sending metric to server. timed out

@Mike Li Is ZK up?

Re: Ambari Metric Collector: Error sending metric to server. timed out

Using the ruok command to check, all 3 ZooKeeper processes are up and running:

./check_zookeeper.ksh imok imok imok
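
The post does not show which hosts and ports check_zookeeper.ksh probed. The AMS logs earlier in the thread show the client trying localhost:61181, which may not be the same ZooKeeper as the three-node quorum, so it could be worth sending ruok to that port as well. A minimal Python sketch of the check, where `zk_ruok` is a hypothetical helper name and port 2181 for the cluster quorum is an assumption:

```python
import socket


def zk_ruok(host, port, timeout=3.0):
    """Send ZooKeeper's 'ruok' four-letter command and return True
    only if the server answers 'imok'."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(b"ruok")
            reply = b""
            # Read until we have the 4-byte reply or the peer closes.
            while len(reply) < 4:
                chunk = sock.recv(4 - len(reply))
                if not chunk:
                    break
                reply += chunk
            return reply == b"imok"
    except OSError:
        return False


if __name__ == "__main__":
    # 61181 is the port from the ClientCnxn log lines; 2181 is the usual
    # ZooKeeper client port and is an assumption about this cluster.
    for port in (61181, 2181):
        print(port, zk_ruok("localhost", port))
```

If 2181 answers imok but 61181 does not, that would be consistent with the "Connection refused" stack traces above.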

Re: Ambari Metric Collector: Error sending metric to server. timed out

@Mike Li The embedded HBase instance is down.
