Recently I set up email notifications in Ambari (126.96.36.199) to receive notifications when alert states change. I've found that throughout each day the YARN alert "App Timeline Web UI: Connection Failed" switches between CRITICAL (connection timed out on port 8188) and OK. This happens 12 to 14 times a day, and I'm unsure why. I have a basic 2 node cluster; the App Timeline and History Server components are on node 1, and the Resource Manager is on node 1, if that helps.
Any thoughts on why this might happen? It doesn't seem to affect performance.
Meanwhile, can you check whether you are able to get the JMX value for the alert? Also, try to get the alert definition using the API:
Make a GET request to get the alert definition for the alert you want to change
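A minimal sketch of such a GET request, assuming the same URL pattern and credentials variables as the PUT command further down in this thread. The function names `alert_definition_url` and `get_alert_definition` are hypothetical helpers, and the host, cluster name and alert number are placeholders you have to supply yourself.

```shell
# Hypothetical helpers, not part of Ambari. Arguments are placeholders:
#   $1 = Ambari server FQDN, $2 = cluster name, $3 = alert definition number.

# Build the alert definition URL, mirroring the PUT URL used in this thread
# (port 8080 is assumed, as in that command).
alert_definition_url() {
  printf 'http://%s:8080/api/v1/clusters/%s/alert_definitions/%s' "$1" "$2" "$3"
}

# Fetch one alert definition and save the JSON to a file
# ($4, defaulting to test.json). Expects $ambari_username and
# $ambari_password to be set in the environment.
get_alert_definition() {
  curl -s -H 'X-Requested-By:ambari' \
    -u "$ambari_username:$ambari_password" \
    -X GET "$(alert_definition_url "$1" "$2" "$3")" \
    -o "${4:-test.json}"
}
```

For example, `get_alert_definition <ambari_fqdn> <cluster-name> <alert_no>` would write the definition to test.json, ready for editing.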
You can try 2 options:
2. You can try modifying the "connection_timeout" value for the timeline web UI. Please follow the steps below:
a. Check the alert definition using the command below:
c. Copy the JSON into a file named test.json.
d. Change the JSON value "connection_timeout" : 5.0 to "connection_timeout" : 50.0 and save the file.
e. PUT it using the command below:
curl -H 'X-Requested-By:ambari' -u $ambari_username:$ambari_password -X PUT --data @test.json http://<ambari_fqdn>:8080/api/v1/clusters/<cluster-name>/alert_definitions/<alert_no>/
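If you would rather script the edit in step d than change test.json by hand, something like the following might work. It is only a sketch: `set_connection_timeout` is a hypothetical helper, and it assumes the file contains a numeric "connection_timeout" entry as described above.

```shell
# Hypothetical helper: set_connection_timeout FILE NEW_VALUE
# Rewrites the numeric value of "connection_timeout" in FILE.
set_connection_timeout() {
  file="$1"
  new_value="$2"
  # Keep the key and colon (group 1) and replace only the number after it.
  # Writes to a temporary file first so the original is replaced in one step.
  sed -E "s/(\"connection_timeout\"[[:space:]]*:[[:space:]]*)[0-9.]+/\\1${new_value}/" \
    "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}
```

Usage: `set_connection_timeout test.json 50.0`, then run the PUT command above. Check the file afterwards to confirm that only the timeout value changed.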