
The alert message below keeps popping up and then clears automatically after some time, sometimes after about a day.

Contributor

The variance for this alert is 36MB which is 21% of the 169MB average (34MB is the limit).

Is this a critical alert? Can anyone suggest a solution for the above alert, or share a link on how to tune it?

3 REPLIES

Explorer

This particular alert is basically a way of detecting deviations from the average. It goes away when the variance is back within the acceptable range of the average (when the average goes up, so too does the limit).

In short, I do not believe this to be a critical alert.

You may change the threshold by finding it in alerts.json for the service in question (for example, is this the HDFS NameNode heap alert?) and changing the thresholds for Warning, Critical, etc.

Super Collaborator

I agree - I don't think it seems like a problem. These types of alerts typically happen when you either:

  • Deploy a new cluster
  • Suddenly push a ton of data through HDFS
  • Have not tuned the alert to your cluster's normal state

Super Collaborator

This alert (and the others like it for HDFS) attempts to tell you whether there are anomalous heap memory readings in your cluster. The thresholds are applied to the average of all heap memory readings to determine what an "appropriate standard deviation" for your cluster is. The numbers are telling you this:

The average HDFS heap in your cluster for a given period of time is 169MB

The first threshold is 20% of that average, which is 34MB. Anything over 34MB could indicate a problem.

The standard deviation in your cluster is at 36MB - it's pretty close and probably nothing to worry about.
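To make the arithmetic concrete, here is a minimal sketch of how those three numbers relate. It assumes the limit is simply a percentage of the average, as the alert text suggests; the function name `deviation_status` and its parameters are made up for illustration and are not taken from the alert's actual code.

```python
def deviation_status(average_mb, deviation_mb, warning_pct=20.0):
    """Return (limit_mb, deviation_pct, exceeded) for a percent-of-average limit.

    Hypothetical helper: assumes the limit is warning_pct percent of the average.
    """
    limit_mb = average_mb * warning_pct / 100.0
    deviation_pct = deviation_mb / average_mb * 100.0
    return limit_mb, deviation_pct, deviation_mb > limit_mb


# Numbers from the alert message: 169MB average, 36MB deviation, 20% threshold.
limit_mb, deviation_pct, exceeded = deviation_status(169, 36)
print(f"limit={limit_mb:.0f}MB deviation={deviation_pct:.0f}% exceeded={exceeded}")
# prints: limit=34MB deviation=21% exceeded=True
```

Because the limit is a share of the average, the same 36MB deviation would not trip the alert if the average heap were, say, twice as large.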

However, if the value were much higher than 34MB, that could indicate a "spike" problem where your heap spikes (but stays within the max heap limit). Compared to all of the smaller memory readings, these spikes would cause a large variance and should be investigated.
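As a rough illustration of the spike effect, the sketch below compares two sets of heap readings. The sample values are invented purely for demonstration, and it assumes a population standard deviation; the alert itself may compute its deviation differently.

```python
from statistics import mean, pstdev


def deviation_pct(samples_mb):
    """Standard deviation of the readings as a percentage of their average."""
    return pstdev(samples_mb) / mean(samples_mb) * 100.0


# Invented heap readings in MB: one steady series, one with a single spike.
steady = [160, 165, 170, 175, 170, 165, 170, 175]
spiky = [160, 165, 170, 175, 170, 165, 170, 600]

for label, samples in (("steady", steady), ("spiky", spiky)):
    pct = deviation_pct(samples)
    flag = "over" if pct > 20.0 else "within"
    print(f"{label}: {pct:.1f}% of average ({flag} a 20% threshold)")
```

A single outlier among otherwise similar readings is enough to push the deviation well past a 20% threshold, even though every reading may still be below the max heap size.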