Context: We need a way to expose, via AMS, whether any service instance (e.g. NiFi, NameNode, DataNode, ResourceManager) is down, so that Grafana can generate alerts and notify us.
Looking through all the available AMS metrics (searching for "heartbeat"), we could only find service-specific metrics such as "dfs.FSNamesystem.ExpiredHeartbeats".
Is there any way to query AMS for missed heartbeats for every server?
Thanks in advance,