Many Ambari "stale alerts" messages

Contributor

Hi all,

Last night I got many of the following Ambari critical alerts:

There are {x} stale alerts from {n} host(s): {components list}

where {x}, {n} and {components list} were not always the same. For example:

There are 20 stale alerts from 1 host(s): NameNode Web UI, Metrics Monitor Status, WebHCat Server Status, NameNode High Availability Health, HST Server Process, NameNode Last Checkpoint, Flume Agent Status, Oozie Server Status, ZooKeeper Failover Controller Process, HBase Master Process, ResourceManager Web UI, HDFS Upgrade Finalized State, Ambari Agent Disk Usage, NameNode Directory Status, DataNode Health Summary, Oozie Server Web UI, DRPC Server Process, NodeManager Health Summary, RegionServers Health Summary, HiveServer2 Process

After 6 minutes, Ambari sent an OK alert:

All alerts have run within their time intervals.

These messages repeated over and over again (13 critical, then 13 OK, in 5 hours). This is the first time I have seen so many alerts from our cluster in a single night, and all the services look fine in Ambari this morning. There have been no more alerts either.
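To see which checks keep showing up across the repeated messages, the alert text can be parsed and tallied. This is only a rough sketch: the regular expression is modelled on the message template quoted above, the function names are made up for illustration, and it assumes that check names never contain commas.

```python
import re
from collections import Counter

# Pattern modelled on the message template quoted in the post.
# Accepting the singular "alert" is an assumption.
ALERT_RE = re.compile(r"There are (\d+) stale alerts? from (\d+) host\(s\): (.+)")


def parse_stale_alert(message):
    """Return (alert_count, host_count, [check names]) or None if no match."""
    match = ALERT_RE.match(message.strip())
    if match is None:
        return None
    # Assumption: check names are comma-separated and contain no commas.
    names = [name.strip() for name in match.group(3).split(",") if name.strip()]
    return int(match.group(1)), int(match.group(2)), names


def tally(messages):
    """Count how often each check name appears across several alert messages."""
    counts = Counter()
    for message in messages:
        parsed = parse_stale_alert(message)
        if parsed is not None:
            counts.update(parsed[2])
    return counts
```

Feeding all 13 critical messages to `tally` and looking at `most_common()` would show whether the same handful of checks is stale every time or whether the set changes from message to message.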

Does anybody have any insight into what might cause this?

Thank you very much in advance!

Xi Sanderson

1 ACCEPTED SOLUTION

Contributor

Hi all,

I opened a support ticket and got an answer back regarding the metastore alerts. It is a known bug in the Ambari release I have (2.1.2):

https://issues.apache.org/jira/browse/AMBARI-14424

The suggested solution is to change this script:

/var/lib/ambari-server/resources/common-services/HIVE/0.12.0.2.0/package/alerts/alert_hive_metastore.py

Search for 30 and replace it with 120, then restart the Ambari server.
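For anyone who would rather script the edit than do it by hand, here is a minimal sketch. It is not the official patch from the JIRA; it only automates the search-and-replace described above. The helper names, the `.bak` backup, and the dry-run default are my own assumptions, and the regular expression only touches a standalone `30` (so `300` or `x30` are left alone). Review the printed lines before writing anything.

```python
import re
import shutil
from pathlib import Path

# Path quoted in the post; it may differ on other stack versions.
SCRIPT = Path(
    "/var/lib/ambari-server/resources/common-services/HIVE/"
    "0.12.0.2.0/package/alerts/alert_hive_metastore.py"
)


def replace_number(text, old="30", new="120"):
    """Replace standalone occurrences of `old`; return (new_text, changes)."""
    # Not preceded by a word character or dot, not followed by a word character.
    pattern = re.compile(r"(?<![\w.])" + re.escape(old) + r"(?!\w)")
    changes = []
    out_lines = []
    for lineno, line in enumerate(text.splitlines(keepends=True), start=1):
        new_line, count = pattern.subn(new, line)
        if count:
            changes.append((lineno, line.rstrip("\n"), new_line.rstrip("\n")))
        out_lines.append(new_line)
    return "".join(out_lines), changes


def patch_file(path, dry_run=True):
    """Print the proposed changes; write them (after a backup) if not dry_run."""
    patched, changes = replace_number(path.read_text())
    for lineno, before, after in changes:
        print(f"{lineno}: {before.strip()}  ->  {after.strip()}")
    if changes and not dry_run:
        # Keep a copy of the original next to the script (hypothetical naming).
        shutil.copy2(path, path.with_name(path.name + ".bak"))
        path.write_text(patched)
    return changes


if __name__ == "__main__" and SCRIPT.exists():
    patch_file(SCRIPT, dry_run=True)
```

If the dry run shows only the line you expect, call `patch_file(SCRIPT, dry_run=False)` and then restart the server with `ambari-server restart`.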

I have yet to monitor how the change works.

Thanks for all the help, everyone!

Xi


13 REPLIES

Master Mentor
@Xi Sanderson

Definitely open a support ticket and use SmartSense to collect logs. Take a look in /var/log/hive for metastore-specific logs and paste any errors from there here. Maybe we can help.
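A quick way to pull the error lines out of those logs, as a hedged sketch: the directory is the one mentioned above, but the file-name pattern and the `ERROR`/`FATAL` level strings are assumptions about how the metastore logs are named and formatted on your cluster.

```python
from pathlib import Path

LOG_DIR = Path("/var/log/hive")   # directory named in the reply above
NAME_PATTERN = "*metastore*"      # assumption: log file names contain "metastore"
LEVELS = ("ERROR", "FATAL")       # assumption: log4j-style level names


def error_lines(lines, levels=LEVELS):
    """Return (line_number, text) for every line mentioning one of the levels."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if any(level in line for level in levels)
    ]


def scan(log_dir=LOG_DIR, pattern=NAME_PATTERN):
    """Print matching lines from every metastore log file in log_dir."""
    for path in sorted(log_dir.glob(pattern)):
        if not path.is_file():
            continue
        # errors="replace" keeps the scan going on odd bytes in the log.
        with path.open(errors="replace") as handle:
            for number, text in error_lines(handle):
                print(f"{path.name}:{number}: {text}")


if __name__ == "__main__" and LOG_DIR.is_dir():
    scan()
```

Narrowing the output to the timestamps of the stale alerts should make it easier to paste only the relevant errors.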


Expert Contributor

@Xi Sanderson @Artem Ervits

Thanks for sharing this useful information. How can I download the patch from JIRA and install it, rather than making the change manually?

This is the first time I am installing an Ambari patch 😞

Master Mentor

Only apply patches if necessary and when instructed by support. In case you don't have a support contract, here are Pivotal's instructions for patching Ambari; we don't provide steps for the reasons above. http://hawq.docs.pivotal.io/docs-hawq/topics/hdp-prerequisites.html

Needless to say, it's at your own risk.