Half of Ambari Metrics are not displaying on dashboard, but all metrics seem to be collected.

Hello All,

This is a brand-new install of HDP 2.5 using Ambari 2.4.1.0-22 on CentOS 7. All my nodes are online, but only half of my metrics are showing up (see attachment capture.png).

I have a feeling it has something to do with my network setup. All of the HDP machines (name node, secondary name node, and data nodes) sit on their own private network (172.x.x.x) with their own set of switches (Nexus 3Ks), connected through a fiber card in each physical machine. Their second NICs (regular Ethernet) are attached to our regular network (192.x.x.x) so that Ambari (running as a VM) can reach and orchestrate them and so that they can reach the internet (I'm not using a local repo).

My secondary name node is the metrics collector. On that machine I see all of the data nodes establishing a connection on port 6188 when I run "netstat -anp | grep 6188". I've looked through /var/log/ambari-metrics-monitor/ambari-metrics-monitor.out and can't find any errors. In the past I would typically see "Connection refused" when the monitor couldn't connect to the collector.

Just to give you a better idea:

Ambari: 192.x.x.x (main network)
HDP machines: 172.x.x.x (Nexus / fiber network) and 192.x.x.x (main network)

NOTE: The Grafana dashboard shows all possible metrics, as far as I can see. But that dashboard is hosted within the 172.x.x.x network, on the secondary name node.

With this network setup, would I need to set the "-Dhttp.proxyHost=<yourProxyHost> -Dhttp.proxyPort=<yourProxyPort>" parameters? I see this suggested as a common solution for my symptoms, but I can't see why I would need it, since all of the machines can already reach the internet through their second NIC on our main network. But maybe this network separation (with Ambari not on the fiber network) is somehow preventing the metrics data from getting back to Ambari? In Ambari, the IP addresses listed for all of the nodes are the private IPs (see attachment capture2.png). Perhaps this is a hint to the problem, though that's what they should be set to.

All of the machines have their private IPs set up in /etc/hosts, so they only see each other on the 172.x.x.x network (except Ambari, which uses regular DNS mapped to the 192.x.x.x IPs).

I've ensured SELinux is disabled on all of the machines, as well as the firewall daemon. NTP servers are configured and synced. I've restarted the Ambari Metrics collector / services multiple times. I'm able to telnet from the data nodes to the collector port.
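For anyone who wants to repeat the telnet-style check from a script, here is a minimal sketch in Python. It only tests whether a TCP connection can be opened from the machine where it runs. The function name can_connect and the host name in the usage line are placeholders for illustration, not anything from the cluster described here.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    # Hypothetical host name; 6188 is the collector port mentioned in the post.
    print(can_connect("collector.example.internal", 6188))
```

Note that the result depends on where the script runs: a check that succeeds from a data node on the 172.x.x.x network says nothing about whether the same port is reachable from the Ambari VM on the 192.x.x.x network.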

Thanks

1 ACCEPTED SOLUTION

I figured this out. It is because of my network layout. The HDP machines all bind to their 172.x.x.x addresses and listen only there. When Ambari tries to connect to the name node on port 50070, for instance, it can't, because the name node is listening on 50070 only on the 172.x.x.x network, which Ambari isn't on.
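One way to spot this kind of binding problem is to look at the local address column of the netstat output on the service host. The sketch below assumes the usual Linux "netstat -ant" / "netstat -anp" column layout (proto, recv-q, send-q, local address, foreign address, state). The sample text, including the 172.16.0.10 address and the PIDs, is made up for illustration and is not output from the cluster in this thread.

```python
WILDCARD_ADDRESSES = ("0.0.0.0", "::", "*")


def listening_addresses(netstat_output, port):
    """Return the local addresses that have a LISTEN socket on `port`."""
    found = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers and non-listening sockets.
        if len(fields) < 6 or fields[5] != "LISTEN":
            continue
        # Split "address:port" on the last colon so IPv6 forms like ":::22" work.
        addr, _, local_port = fields[3].rpartition(":")
        if local_port == str(port):
            found.append(addr)
    return found


def binds_all_interfaces(addresses):
    """True if any listener is bound to a wildcard address."""
    return any(addr in WILDCARD_ADDRESSES for addr in addresses)


# Illustrative sample only (hypothetical address and PIDs).
SAMPLE = """\
tcp        0      0 172.16.0.10:50070       0.0.0.0:*               LISTEN      2143/java
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      1021/sshd
"""

if __name__ == "__main__":
    addrs = listening_addresses(SAMPLE, 50070)
    print(addrs, binds_all_interfaces(addrs))
```

A listener bound to one specific address is only reachable through that interface, while a wildcard listener accepts connections on every interface of the host.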

I resolved the issue by setting up multi-homing: https://community.hortonworks.com/articles/24277/parameters-for-multi-homing.html
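The thread does not say which multi-homing parameters were actually changed. As a rough sketch only, the Apache Hadoop multihoming documentation describes NameNode bind-host properties that make the daemon listen on all interfaces. The snippet below renders those properties in hdfs-site.xml form. The helper name render_properties is made up for illustration, and whether these four settings match what the linked article recommends is an assumption.

```python
import xml.etree.ElementTree as ET

# NameNode bind-host properties from the Apache Hadoop multihoming docs.
# Assumption: the thread does not state which settings were applied.
BIND_ALL_INTERFACES = {
    "dfs.namenode.rpc-bind-host": "0.0.0.0",
    "dfs.namenode.servicerpc-bind-host": "0.0.0.0",
    "dfs.namenode.http-bind-host": "0.0.0.0",
    "dfs.namenode.https-bind-host": "0.0.0.0",
}


def render_properties(props):
    """Render a dict of settings as a Hadoop-style <configuration> XML string."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(render_properties(BIND_ALL_INTERFACES))
```

On an Ambari-managed cluster, settings like these would normally be entered through the HDFS configuration screens rather than by editing the XML file by hand, followed by a restart of the affected services.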
