In my Ambari 2.3 cluster, all the services are running fine, but the Ambari UI is still showing alerts. I have restarted all the services many times, but the alerts are still visible in the UI. Can anyone help me with this?

Sometimes even though a service is running, there could be alerts which are triggering for it. Some examples include a service which is technically up but has a 500 error on its web management page. Or perhaps a metric alert is firing because of a value which is outside of the range of acceptable values.

Can you please provide more context on which alerts are present?
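If it helps to gather that list, here is a minimal sketch that filters an alerts listing down to the ones that are actually firing. It assumes the JSON comes from the Ambari REST API's cluster alerts resource (something like /api/v1/clusters/<cluster>/alerts?fields=*), and that each item carries an "Alert" object with "state", "label", "host_name" and "text" fields. The function name summarize_alerts is made up for illustration, and the field names should be checked against what your Ambari version returns.

```python
import json


def summarize_alerts(payload, states=("CRITICAL", "WARNING")):
    """Return (state, label, host, text) tuples for alerts in the given states.

    `payload` is the alerts listing, either as a JSON string or as an
    already-parsed dict. The "items"/"Alert" layout is an assumption about
    the response shape, not something confirmed in this thread.
    """
    data = json.loads(payload) if isinstance(payload, str) else payload
    rows = []
    for item in data.get("items", []):
        alert = item.get("Alert", {})
        if alert.get("state") in states:
            rows.append((
                alert.get("state"),
                alert.get("label"),
                alert.get("host_name"),
                alert.get("text"),
            ))
    return rows
```

Pasting the resulting rows into the thread shows which alert, on which host, is reporting what text.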

In some older versions of Ambari, alerts would remain for hosts/services which had been removed. So if you're using an older version, it could be that as well.

Hi Jonathan,

Thanks for your reply. Here is a more detailed analysis.

I am getting the following alerts:

Connection failed to http://hostname:8042 (<urlopen error timed out>)

JournalNode Web UI

NodeManager Web UI

Oozie Server Web UI

History Server Web UI

Percent JournalNodes Available

App Timeline Web UI

WebHCat Server Status

Falcon Server Web UI

Metrics Collector - Auto-Restart Status

All the web UIs are working fine, but I don't know why Ambari is still showing alerts. I even ran telnet against the port numbers and it shows as connected. This is a test cluster and has no firewall restrictions.

Let's take the first alert: Connection failed to http://hostname:8042 (<urlopen error timed out>)

Remember that this is being run from the specific host to itself. So, in order to verify that things work, you'd need to first log in to that host, and then run wget from that host to the FQDN in the error above. Also, check your environment for any possible proxy settings, like "export http_proxy".
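As a rough equivalent of that check in Python, the sketch below lists any proxy variables set in the environment and then fetches a URL with a timeout. It is only an illustration: the helper names proxy_settings and check_url are made up here, and it is not the code the alert itself runs. One thing worth knowing is that urllib's urlopen picks up http_proxy-style variables by default, which is why a proxy setting can make a local web UI look unreachable.

```python
import os
import urllib.error
import urllib.request

# Environment variables that commonly configure an HTTP proxy.
PROXY_VARS = ("http_proxy", "https_proxy", "HTTP_PROXY", "HTTPS_PROXY",
              "no_proxy", "NO_PROXY")


def proxy_settings(environ=None):
    """Return the proxy-related variables that are set and non-empty."""
    environ = os.environ if environ is None else environ
    return {name: environ[name] for name in PROXY_VARS if environ.get(name)}


def check_url(url, timeout=5.0):
    """Try to fetch `url`; return (ok, detail) instead of raising."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return True, "HTTP %d" % response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 500.
        return False, "HTTP %d" % err.code
    except (urllib.error.URLError, OSError) as err:
        # Covers timeouts, refused connections and DNS failures.
        return False, str(err)
```

Run on the affected host, print(proxy_settings()) followed by print(check_url("http://hostname:8042")) (with the real FQDN from the alert in place of hostname) should show whether a proxy is in play and whether the request succeeds from that host.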

Hi Jonathan, could you please give me the exact steps I need to follow? What should I do with wget?

You should wget the address specified in the error; the http://hostname:8042 address.

The alerts run on their respective hosts. If the alert is for a NodeManager, then for every NodeManager in your system, each one will attempt to connect to its own FQDN.
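To see what "its own FQDN" looks like on a given host, a small sketch along these lines can help. It assumes the name the alert uses matches what Python's socket.getfqdn() reports on that host, which may not hold if the agent is configured with a custom hostname. The helper names self_check_url and resolve are hypothetical.

```python
import socket


def self_check_url(port, fqdn=None, scheme="http"):
    """Build the URL a host would use to reach its own web UI on `port`."""
    fqdn = fqdn or socket.getfqdn()
    return "%s://%s:%d" % (scheme, fqdn, port)


def resolve(fqdn):
    """Return the sorted set of IP addresses that `fqdn` resolves to."""
    infos = socket.getaddrinfo(fqdn, None)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address string.
    return sorted({info[4][0] for info in infos})
```

On a NodeManager host, self_check_url(8042) gives the address to test, and resolve(socket.getfqdn()) shows which IP that name maps to locally. If the name resolves to an unexpected address, the self-connection can time out even though the UI is reachable from elsewhere.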

I got the response below after running wget, Jonathan.

Connecting to||:8042... connected.
HTTP request sent, awaiting response... 302 Found
Location: [following]
--2017-10-05 19:26:12--
Reusing existing connection to
HTTP request sent, awaiting response... 200 OK
Length: 6184 (6.0K) [text/html]
Saving to: “index.html”
100%[=============================================================>] 6,184 --.-K/s in 0s
2017-10-05 19:26:12 (512 MB/s) - “index.html” saved [6184/6184]

