We recently decommissioned one of our DataNodes. Afterward, Ambari started complaining about the "Percent DataNodes Available" alert because it is still counting the decommissioned DataNode.
The "DataNode Process" alert is OK for all of the remaining DataNodes.
HDFS shows that all of the DataNodes are Started and Live.
How can we get Ambari to stop complaining about the DataNode that no longer exists?
The aggregate alerts take into account the status of individual instance alerts. The "Percent DataNodes Available" alert is tripping because your decommissioned DataNode is still being "managed" by Ambari. You need to either remove the DataNode or put that specific instance into Maintenance Mode.
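If you prefer to script the Maintenance Mode option rather than use the web UI, here is a minimal sketch against the Ambari REST API. It only builds the request and does not send it. The base URL, cluster name, host name, and credentials are hypothetical placeholders. I am assuming the usual `host_components` endpoint and the `HostRoles/maintenance_state` field, so check both against your Ambari version before relying on it.

```python
import base64
import json
import urllib.request


def datanode_maintenance_request(base_url, cluster, host, user, password, state="ON"):
    """Build (but do not send) a PUT request that sets Maintenance Mode
    on the DATANODE host component of a single host.

    Assumption: the endpoint path and payload shape follow the Ambari
    v1 REST API; all argument values are supplied by the caller.
    """
    url = f"{base_url}/api/v1/clusters/{cluster}/hosts/{host}/host_components/DATANODE"
    payload = {
        "RequestInfo": {"context": f"Turn {state} Maintenance Mode for DataNode"},
        "Body": {"HostRoles": {"maintenance_state": state}},
    }
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        method="PUT",
        headers={
            "Authorization": f"Basic {token}",
            # Ambari rejects modifying requests without this header.
            "X-Requested-By": "ambari",
        },
    )


if __name__ == "__main__":
    # Hypothetical values, replace with your own.
    req = datanode_maintenance_request(
        "http://ambari.example.com:8080", "mycluster",
        "worker3.example.com", "admin", "admin",
    )
    print(req.get_method(), req.full_url)
    # To actually send it: urllib.request.urlopen(req)
```

Note that this only applies while the DataNode component still exists on the host in Ambari. If the component has already been deleted, the request would presumably fail with a "not found" error.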
We decommissioned, stopped, and deleted the DataNode component using Ambari. I don't know if Ambari is confused because that Host still exists to run other services, but Ambari is still counting it in the aggregate alert even though that Host no longer has a DataNode component. It may be related to this bug: https://issues.apache.org/jira/browse/AMBARI-22581
However, I don't know how to work around it, other than ignoring/disabling the alert.
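One way to confirm what Ambari is still counting is to compare the hosts that have a "DataNode Process" alert instance with the hosts that actually run a DataNode. The sketch below is a diagnostic only, not a fix. It assumes you have already fetched the JSON from something like `GET /api/v1/clusters/<cluster>/alerts?Alert/definition_name=datanode_process`, and that the response has an `items` list whose entries carry `Alert/host_name`. The definition name and the response shape are my assumptions, and the host names are made up.

```python
def hosts_with_alert_instances(alerts_response):
    """Return the sorted host names that have an alert instance in the
    parsed alerts response (assumed shape: {"items": [{"Alert": {...}}]})."""
    return sorted({
        item["Alert"]["host_name"]
        for item in alerts_response.get("items", [])
        if item.get("Alert", {}).get("host_name")
    })


def stale_alert_hosts(alerts_response, live_datanode_hosts):
    """Hosts that still have an alert instance but no longer run a DataNode."""
    return sorted(
        set(hosts_with_alert_instances(alerts_response)) - set(live_datanode_hosts)
    )


if __name__ == "__main__":
    # Hypothetical sample data standing in for a real API response.
    sample = {
        "items": [
            {"Alert": {"host_name": "dn1.example.com", "state": "OK"}},
            {"Alert": {"host_name": "dn2.example.com", "state": "OK"}},
            {"Alert": {"host_name": "dn3.example.com", "state": "CRITICAL"}},
        ]
    }
    live = ["dn1.example.com", "dn2.example.com"]
    print(stale_alert_hosts(sample, live))  # ['dn3.example.com']
```

If a host shows up here after its DataNode component was deleted, that would be consistent with a leftover alert instance feeding the aggregate count.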
We stopped all the services and rebooted every host in our cluster for unrelated maintenance and, surprisingly, Ambari stopped complaining. The aggregate alert now shows the correct total count.