This is a brand-new install of HDP 2.5 using Ambari 22.214.171.124-22 on CentOS 7. All of my nodes are currently online, but only half of my metrics are showing up (see attachment capture.png).
I have a feeling it has something to do with my network setup. All of the machines (name node, secondary name node, and data nodes) are on their own private network (172.x.x.x) with a dedicated set of switches (Nexus 3Ks), connected through a fiber card in each physical machine. Each machine's second NIC (regular Ethernet) is attached to our regular network (192.x.x.x) so that Ambari (running as a VM) can reach and orchestrate them and so that they can reach the internet (I'm not using a local repo). My secondary name node is the metrics collector, and on that machine I see all of the data nodes establishing a connection on port 6188 when I run "netstat -anp | grep 6188". I've looked through /var/log/ambari-metrics-monitor/ambari-metrics-monitor.out for errors and can't find any. In the past I would typically see "Connection refused" when the monitor couldn't connect to the collector.
Just to give you a better idea:
Ambari - 192.x.x.x (main network)
HDP machines - 172.x.x.x (Nexus / fiber network) / 192.x.x.x (main network)
NOTE: From what I can see, the Grafana dashboard is showing all possible metrics, but that dashboard is hosted within the 172.x.x.x network on the secondary name node.
With this network setup, would I need to set the "-Dhttp.proxyPort=<yourProxyPort>" parameters? I can't see why I would need to, but I keep seeing this as a common solution to my symptoms. All of the machines can already access the internet through their second NIC, which is attached to our main network. But maybe this network separation (with Ambari not on the fiber network) is somehow keeping the metrics data from getting back to Ambari? In Ambari, the IP addresses listed for all of the nodes are the private IPs (attachment 2, capture2.png). Perhaps this is a hint to the problem, though that's what they should be set to.
All of the machines have their private IPs set up in /etc/hosts, so they only see each other on the 172.x.x.x network (except Ambari, which uses regular DNS mapped to the 192.x.x.x IPs).
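To compare what each side actually resolves, something like the following could be run on both a cluster node and the Ambari VM. It is only a rough sketch: the two CIDR ranges are assumptions standing in for my 172.x.x.x and 192.x.x.x networks, and the host names are passed on the command line.

```python
import ipaddress
import socket
import sys

# ASSUMPTION: placeholder CIDRs for the 172.x.x.x (fiber) and
# 192.x.x.x (main) networks; adjust to the real ranges.
NETWORKS = {
    "fiber": ipaddress.ip_network("172.16.0.0/12"),
    "main": ipaddress.ip_network("192.168.0.0/16"),
}


def classify(ip):
    """Return the label of the network that contains ip, or 'other'."""
    addr = ipaddress.ip_address(ip)
    for label, net in NETWORKS.items():
        if addr in net:
            return label
    return "other"


def resolve(hostname):
    """Return the sorted IPv4 addresses the local resolver gives for hostname."""
    try:
        infos = socket.getaddrinfo(hostname, None, socket.AF_INET)
    except socket.gaierror:
        return []
    # info[4] is the sockaddr tuple; its first element is the IP string.
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    # Usage: python resolve_check.py <host1> <host2> ...
    for name in sys.argv[1:]:
        ips = resolve(name)
        if not ips:
            print(f"{name}: unresolved")
        for ip in ips:
            print(f"{name}: {ip} ({classify(ip)})")
```

If the same host name comes back as "fiber" on the nodes and "main" on the Ambari VM, that would confirm the two sides are using different addresses for the same machines.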
I've ensured SELinux is disabled on all of the machines, as well as the firewall daemon. NTP servers are configured and synced. I've restarted the Ambari Metrics collector / services multiple times. I'm able to telnet from the data nodes to the collector port.
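The telnet test can also be scripted so it can be repeated from every node, and from the Ambari VM as well. This is a minimal sketch using only plain TCP; the collector host name is whatever you pass in, and 6188 is the collector port mentioned above. It only shows that the port accepts connections, not that metrics are actually being stored or served.

```python
import socket
import sys


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    # Usage: python port_check.py <collector-host> [port]
    if len(sys.argv) >= 2:
        host = sys.argv[1]
        port = int(sys.argv[2]) if len(sys.argv) >= 3 else 6188
        state = "reachable" if can_connect(host, port) else "NOT reachable"
        print(f"{host}:{port} is {state}")
```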