Ambari server unable to connect to collector for HBase metrics - Read SocketTimeoutException (both on same host)

Contributor

When I log in to the UI, the metrics summary for HBase is not shown. For the other services, such as HDFS and Storm, the metrics show up.

The collector and the Ambari server are on the same host.

The exception I get in the server log is:

--------------------------------

DEBUG [qtp-ambari-client-26 - /api/v1/clusters/cluster_name/services/HBASE/components/HBASE_REGIONSERVER?_=1454039202858] MetricsRequestHelper:116 - Error getting timeline metrics : Read timed out

java.net.SocketTimeoutException: Read timed out

at java.net.SocketInputStream.socketRead0(Native Method)

--------------------------------

Any help would be appreciated.

1 ACCEPTED SOLUTION

Super Collaborator

Also check for firewall/iptables issues.


9 REPLIES

Master Mentor

@vishnu rao

Make sure that HBase and the RegionServers (RS) are up.

Then restart AMS.

Contributor

Hi Neeraj,

The HBase Master and RegionServers are up. I am able to see the summary, but not the graphs.

I seem to have worked around the problem by deleting a widget (the read latency widget). After that, everything else appeared, even though that one latency widget is no longer there.

I added a graph for read latency via ambari-grafana, and the metric is now available there.

Super Collaborator

Also check for firewall/iptables issues.

Contributor

Hi Rahul,

The collector and the server are on the same host, and so is Ambari Grafana.

As I mentioned, removing a widget made everything else appear. I have added a graph in Grafana for the metric from the widget I removed.

Master Mentor

@vishnu rao

A firewall can prevent the monitors from sending data to the collector, so please make sure all servers are allowed through.
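One quick way to rule a firewall in or out is to try opening a TCP connection to the collector from each host that runs a monitor. The snippet below is only a minimal sketch: the helper name `can_connect` is made up for illustration, and port 6188 is an assumption about the collector's web port, so check the value configured for your cluster before relying on it.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and connect timeouts.
        return False


if __name__ == "__main__":
    # Hypothetical values: replace with your collector host and port.
    collector_host = "localhost"
    collector_port = 6188  # assumed collector web port, verify in your AMS config
    print(can_connect(collector_host, collector_port))
```

If this returns False from a monitor host but True on the collector host itself, a firewall or iptables rule between the two is a reasonable suspect. A True result only shows that the port is reachable; it does not say anything about how quickly the collector answers queries.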

Master Mentor

@vishnu rao That sounds like a good workaround. I am focusing on this message: "java.net.SocketTimeoutException: Read timed out"

A read timeout can occur when there is network latency. Nice explanation

For example: say everything else is working fine, but the load on the node is very high. The connection tries to read the metric data and can't, because the node has no capacity left to process the request in time.
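To make the difference between "cannot connect" and "read timed out" concrete, here is a small self-contained sketch in Python (not Ambari code). It starts a local listener that never replies, which stands in for an overloaded server, and shows that the connection succeeds while the read gives up after the timeout. The function names and the request line are made up for illustration.

```python
import socket


def start_silent_server():
    """Listen on an ephemeral local port and never read or reply.

    The OS completes the TCP handshake for queued connections, so clients
    can connect even though this "server" never answers them.
    """
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))
    srv.listen(1)
    return srv, srv.getsockname()[1]


def read_with_timeout(port: int, timeout: float) -> str:
    """Connect, send a request, and wait up to `timeout` seconds for a reply."""
    with socket.create_connection(("127.0.0.1", port), timeout=timeout) as conn:
        conn.sendall(b"GET /metrics HTTP/1.0\r\n\r\n")
        try:
            conn.recv(1024)
            return "got data"
        except socket.timeout:
            # The connection was established, but no response arrived in time.
            return "read timed out"


if __name__ == "__main__":
    srv, port = start_silent_server()
    print(read_with_timeout(port, timeout=1.0))  # prints "read timed out"
    srv.close()
```

The point of the sketch is that a read timeout means the client reached the server but did not get an answer within the allowed time. That is consistent with a slow or overloaded collector, although it does not by itself exclude network problems after the connection was opened.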

Contributor

Aye, agreed. It seems to be a load issue, but it's kind of strange. Will debug more.

Master Mentor

@vishnu rao OK. Keep us posted. If it's a load issue or an overloaded network, then I guess we do have a root cause.
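If load is the suspect, one way to gather evidence is to query the collector directly and time the response, bypassing the Ambari server. The following is a rough sketch under several assumptions: that the collector serves its timeline metrics at `/ws/v1/timeline/metrics` with `metricNames`, `appId`, `startTime` and `endTime` query parameters, that it listens on port 6188, and that `hbase` is the right `appId`. The metric name is a placeholder, and the helper names are hypothetical.

```python
import time
import urllib.parse
import urllib.request


def build_metrics_url(base_url, metric_names, app_id, start_ms, end_ms):
    """Build a query URL for the collector's timeline metrics endpoint (assumed path)."""
    query = urllib.parse.urlencode({
        "metricNames": ",".join(metric_names),
        "appId": app_id,
        "startTime": start_ms,
        "endTime": end_ms,
    })
    return f"{base_url}/ws/v1/timeline/metrics?{query}"


def timed_get(url, timeout=10.0):
    """Return (seconds elapsed, bytes received); raises on timeout or HTTP error."""
    started = time.monotonic()
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        body = resp.read()
    return time.monotonic() - started, len(body)


if __name__ == "__main__":
    now_ms = int(time.time() * 1000)
    url = build_metrics_url(
        "http://localhost:6188",          # assumed collector address
        ["placeholder.metric.name"],      # hypothetical: use the widget's metric
        "hbase",                          # assumed appId for HBase metrics
        now_ms - 60 * 60 * 1000,          # one hour ago
        now_ms,
    )
    print(url)
    # elapsed, size = timed_get(url)     # uncomment to run against a live collector
```

Comparing the elapsed time for the read latency metric with the time for metrics that do display could show whether that one query is the slow one.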

Master Mentor
@vishnu rao

Follow the links on this page for tuning, known issues, and troubleshooting: Link