Grafana failed to start

Explorer

Dears,

I need your help fixing a Grafana issue: when I start services using Ambari, all of them start successfully except Grafana. After it fails to start, all other services on the same host fail as well. While troubleshooting, I found that:

1) The process ID in "grafana-server.pid" does not match the actual PID shown by "ps aux | grep grafana". After I change it manually in grafana-server.pid, it reverts to a mismatched PID when I restart Grafana.

2) The log in "/var/log/ambari-metrics-grafana/log" says: "web.go:93 StartServer()] [E] Fail to start server: listen tcp 0.0.0.0:3000: bind: address already in use"

Could you please help?
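For anyone who wants to repeat the PID comparison from point 1 without editing the file by hand, here is a minimal sketch, assuming bash and the pgrep tool from procps. The function name pid_file_matches and the PID file path in the usage comment are hypothetical; the real location of grafana-server.pid is not given in this thread.

```shell
# Succeeds (exit 0) when the PID stored in the PID file equals one of the
# PIDs passed as the remaining arguments; fails (exit 1) otherwise,
# including when the file is missing or empty.
pid_file_matches() {
    local pid_file="$1"; shift
    local recorded pid
    [ -r "$pid_file" ] || return 1
    # Strip whitespace and the trailing newline from the stored PID.
    recorded="$(tr -d '[:space:]' < "$pid_file")"
    [ -n "$recorded" ] || return 1
    for pid in "$@"; do
        [ "$pid" = "$recorded" ] && return 0
    done
    return 1
}

# Usage (path is a placeholder, adjust to your installation):
#   pid_file_matches /path/to/grafana-server.pid $(pgrep -f grafana) \
#       && echo "PID file matches" || echo "PID file is stale or mismatched"
```

Unlike "ps aux | grep grafana", pgrep does not list itself, so the comparison is not confused by the grep process.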

grafana-error-1377.txt

grafana-error-output-1377.txt

grafana-log.txt

1 ACCEPTED SOLUTION

Explorer

Dears,

Just to update:

I found the solution in https://community.hortonworks.com/articles/18088/ambari-shows-hdp-services-to-be-down-whereas-they.h...

"One of the issue could be due to /var/lib/ambari-agent/data/structured-out-status.json. Cat this file to review the content. Typical content could be like following:

cat structured-out-status.json
{"processes": [], "securityState": "UNKNOWN"}

Compare the content with the same file in another node which is working fine.

Stop ambari-agent, move this file to another file and restart ambari-agent."

After that, I restarted ambari-agent on the relevant host, and everything is working properly again.
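The quoted steps could be scripted roughly as follows. This is only a sketch, assuming bash and that the ambari-agent command is on the PATH of the user running it (usually root). The function name reset_agent_status_file, the ".bak" backup suffix, and the DRY_RUN switch are my own choices, not part of the article.

```shell
# Stop the agent, move the cached status file aside, and start the agent again.
# $1 (optional) = status file; defaults to the path quoted above.
# Set DRY_RUN=1 to print the commands instead of running them.
reset_agent_status_file() {
    local status_file="${1:-/var/lib/ambari-agent/data/structured-out-status.json}"
    local backup="${status_file}.bak"   # arbitrary backup name (assumption)
    local run=""
    if [ "${DRY_RUN:-0}" = "1" ]; then
        run="echo"
    fi
    # Each step runs only if the previous one succeeded.
    $run ambari-agent stop &&
        $run mv "$status_file" "$backup" &&
        $run ambari-agent start
}

# Preview first, then run for real:
#   DRY_RUN=1 reset_agent_status_file
#   reset_agent_status_file
```

Moving the file instead of deleting it keeps a copy that can still be compared with the same file on a healthy node.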

11 REPLIES

@saif

Make sure no other service is running on port 3000 before you attempt to start Grafana.

Run "netstat -anlp | grep 3000" to see which process is using that port, then kill it if it is not needed.
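One caveat with a plain "grep 3000" is that it also matches other ports such as 30000, PIDs that contain 3000, and outgoing connections to a remote port 3000. A small sketch of a stricter filter, assuming bash, awk, and the Linux net-tools column layout of "netstat -anlp" (the function name listeners_on_port is hypothetical):

```shell
# Reads `netstat -anlp`-style lines on stdin and prints only TCP sockets
# in LISTEN state whose local address ends in the given port.
# $1 = port number
listeners_on_port() {
    awk -v port="$1" '
        # Columns: Proto Recv-Q Send-Q Local-Address Foreign-Address State PID/Program
        $1 ~ /^tcp/ && $6 == "LISTEN" && $4 ~ (":" port "$") { print }
    '
}

# Usage (run as root so the PID/Program column is filled in):
#   netstat -anlp | listeners_on_port 3000
```

The last column of any line it prints shows the PID and program name of the process that holds the port.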

Explorer

@amarnathreddy pappu

I followed your recommendation. Grafana ran for some time and then went down, along with all the other services on the same host.

Whenever I kill the process before restarting all services on the host where Grafana is installed, the services work for a few seconds and then all go down.

Why do I have to kill the process listening on ":::3000" every time? Is there no way to prevent it from running at all?

Why do all services go down only on this specific host?

Please help.

@Saif

You don't have to kill it every time. What is the error in the Grafana logs? Earlier you mentioned it was not even starting, so now you can check the logs for new error messages.

Explorer

@amarnathreddy pappu

Thank you very much.

I think I have tried that; I will try it again and let you know.

So what should I do if another service takes that port? Just kill that service?

What is the proper way to check whether another service is using that port on Ubuntu?

Explorer

@amarnathreddy pappu

Thank you very much.

I ran that command, but it seems that no process is using that port.

Could you please check the attached screenshot to see if I am right? If so, what is the issue then?

netstat-anlp-3000.png

Expert Contributor
@Saif

This issue has been fixed in https://issues.apache.org/jira/browse/AMBARI-19054. Can you upgrade to Ambari 2.5.0? Or you can manually update the /usr/sbin/ambari-metrics-grafana file with the one from Ambari 2.5.0.

Explorer

@Aravindan Vijayan

Thanks a lot for your response, Aravindan. I spent days on this issue and have just found that:

1) When I kill the process listening on ":::3000" and restart the services, they run and then show as down in the Ambari dashboard only.

2) When I checked some services through their URLs and specific ports, I found that the services are actually running.

Which is less risky: upgrading HDP or manually updating Grafana? I spent a long time building the cluster, and I do not want to run into new unexpected issues.

If you have a good, clear reference to follow for the upgrade, it would be highly appreciated.

Explorer

@Aravindan Vijayan

But why does this issue not appear in my other test HDP 2.4 cluster, which started running a few months ago?

Expert Contributor

While the Grafana service is stopped in Ambari, I find the leftover Grafana process with:

ps aux | grep grafana

Then I kill that process and start Grafana from Ambari.