Created 04-24-2017 01:21 PM
Dears,
I need your help fixing a Grafana issue: when I start services using Ambari, all of them start successfully except Grafana. After Grafana fails to start, all other services on the same host fail as well. While troubleshooting, I found that:
1) The process ID in "grafana-server.pid" does not match the actual PID shown by "ps aux | grep grafana". After I change it manually in grafana-server.pid, it reverts to a mismatched PID when I restart Grafana.
2) In "/var/log/ambari-metrics-grafana/log", it says "web.go:93 StartServer()] [E] Fail to start server: listen tcp 0.0.0.0:3000: bind: address already in use"
Could you please help?
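For reference, a quick way to tell whether the PID recorded in a PID file still belongs to a running process could look like the sketch below. It is only an illustration: `pidfile_is_live` is a hypothetical helper name, and since the full path to grafana-server.pid is not given here, the path is passed in as an argument.

```shell
# Hypothetical helper: succeed if the PID stored in the given file
# refers to a process that currently exists.
# `kill -0` sends no signal; it only checks that the PID can be signalled,
# so run this as root or as the user that owns the process.
pidfile_is_live() {
  pid=$(cat "$1" 2>/dev/null) || return 1
  [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}

# Example (the path is a placeholder, not taken from this thread):
# pidfile_is_live /path/to/grafana-server.pid && echo "PID is live" || echo "stale PID file"
```

If the helper reports a stale PID file while "ps aux | grep grafana" still shows a Grafana process, the running process was probably started separately from the one that wrote the file.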
Created 04-28-2017 02:00 AM
Dears,
Just to update:
I found the solution in https://community.hortonworks.com/articles/18088/ambari-shows-hdp-services-to-be-down-whereas-they.h...
"One of the issue could be due to /var/lib/ambari-agent/data/structured-out-status.json. Cat this file to review the content. Typical content could be like following:
cat structured-out-status.json
{"processes": [], "securityState": "UNKNOWN"}
Compare the content with the same file in another node which is working fine.
Stop ambari-agent, move this file to another file and restart ambari-agent."
After that, I restarted ambari-agent on the affected host and everything is working properly again.
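The steps quoted above can be sketched roughly as follows. This is a minimal sketch, not the article's exact procedure: `set_aside` is a hypothetical helper that renames the file with a timestamp suffix so it can be restored later, and the `ambari-agent stop` / `ambari-agent start` calls are left commented out so they are run deliberately, as root, on the affected host.

```shell
# Hypothetical helper: move a file out of the way, keeping a timestamped copy.
# Prints the new file name on success.
set_aside() {
  src=$1
  [ -f "$src" ] || { echo "no such file: $src" >&2; return 1; }
  dst="$src.bak.$(date +%Y%m%d%H%M%S)"
  mv -- "$src" "$dst" && printf '%s\n' "$dst"
}

# Intended sequence on the affected host (run as root):
# ambari-agent stop
# set_aside /var/lib/ambari-agent/data/structured-out-status.json
# ambari-agent start
```

Keeping the old file instead of deleting it makes it possible to compare it with the one the agent regenerates, or with the same file on a healthy node.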
Created 04-24-2017 06:00 PM
@saif
Make sure no other service is running on port 3000 before you attempt to start Grafana.
Created 04-24-2017 06:01 PM
Just run "netstat -anlp | grep 3000", check which process is using that port, and kill it if it is not needed.
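A plain "grep 3000" also matches PIDs and other ports that merely contain 3000, so a slightly stricter filter may help. The sketch below assumes the usual Linux net-tools column layout (local address in column 4, state in column 6); `listeners_on_port` is a hypothetical helper name.

```shell
# Hypothetical helper: read netstat output on stdin and print only the
# TCP sockets in LISTEN state whose local address ends in ":<port>".
listeners_on_port() {
  awk -v port="$1" '$1 ~ /^tcp/ && $4 ~ (":" port "$") && $6 == "LISTEN"'
}

# Example (run as root so the PID/Program name column is filled in):
# netstat -tlnp | listeners_on_port 3000
```

A local address of ":::3000" is the IPv6 wildcard form; the last column shows the PID and program name that own the socket, which tells you whether the listener is Grafana itself or something else.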
Created 04-26-2017 03:01 PM
I followed your recommendation; Grafana ran for some time and then went down along with all the other services on the same host.
Whenever I kill the process before restarting all services on the host where Grafana is installed, all services work for a few seconds and then go down.
Why do I have to kill the process on ": : : 3000" every time? Is there no way to prevent it from running at all?
Why do all services go down only on this specific host?
Please help.
Created 04-26-2017 05:00 PM
You don't have to kill it every time. What is the error in the Grafana logs? Earlier you mentioned it was not even starting, so now you can check the logs for new error messages.
Created 04-24-2017 06:11 PM
Thank you very much.
I think I have already tried that, but I will try again and let you know.
So what should I do if another service has taken that port? Just kill that service?
What is the proper way to check whether another service is using that port on Ubuntu?
Created 04-25-2017 06:31 PM
Thank you very much.
I ran that command, but it seems that no process is using that port.
Could you please check the attached screenshot to see if I am right? If so, what is the issue then?
Created 04-26-2017 06:39 PM
This issue has been fixed in https://issues.apache.org/jira/browse/AMBARI-19054. Can you upgrade to Ambari 2.5.0? Or you can manually update the /usr/sbin/ambari-metrics-grafana file with the one from Ambari 2.5.0.
Created 04-26-2017 07:28 PM
Thanks a lot for your response, Aravindan. I spent days on this issue and have just found that:
1) When I kill the process on ": : :3000" and restart the services, they run and then show as down, but only in the Ambari dashboard.
2) When I checked some services directly using their URLs with specific ports, I found that the services are running.
Which option is less risky: upgrading HDP or manually updating Grafana? I spent a long time building the cluster and I do not want to run into new, unexpected issues.
If you have a good, clear reference to follow for upgrading, it would be highly appreciated.
Created 04-26-2017 08:18 PM
But why does this issue not appear in my other test HDP 2.4 cluster, which has been running for a few months?!