Created 09-09-2016 10:18 AM
Hi Community,
We have seen unusual behavior on our Hortonworks cluster: for one of the hosts, Ambari shows all services as down, although the services are actually running properly. This persists even after starting the services from Ambari.
We have checked the following:
1) Verified the status of the services from their logs, e.g. /var/hadoop/hdfs/hadoop-namenode .... (shows running).
2) Checked the PID of the services in /var/run/<servicename>/<servicename.pid> (the new PID was present).
3) Stopped and started the Ambari agents and the Ambari server, but it didn't help.
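For anyone repeating check 2), here is a minimal sketch of how a PID file could be compared against the live process table. The function name pid_file_alive is hypothetical (it is not part of Ambari or HDP), and the sketch assumes it is run as root, since kill -0 also fails for processes owned by another user.

```shell
# Hypothetical helper: report whether the PID stored in a PID file is alive.
# Assumes a single numeric PID in the file; run as root on the affected host.
pid_file_alive() {
    pid_file="$1"
    if [ ! -r "$pid_file" ]; then
        echo "missing: $pid_file"
        return 2
    fi
    # Strip whitespace and newlines around the stored PID.
    pid="$(tr -d '[:space:]' < "$pid_file")"
    case "$pid" in
        ''|*[!0-9]*)
            echo "invalid PID in $pid_file"
            return 2
            ;;
    esac
    # kill -0 sends no signal; it only tests whether the process can be signalled.
    if kill -0 "$pid" 2>/dev/null; then
        echo "running: PID $pid"
    else
        echo "stale: PID $pid is not running"
        return 1
    fi
}
```

It could be called with any of the PID files under /var/run, for example pid_file_alive /var/run/ambari-agent/ambari-agent.pid for the agent itself.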
Is there any way to fix this issue? If you need more information, please do let me know.
HDP Version: 2.3.4
Ambari Version: 2.1.1
Thanks in Advance.
Cheers !
Hammad
Created 09-14-2016 07:23 AM
Hello All,
Thanks all for your help. I managed to resolve the issue by doing the following:
1) I noticed this error message in the ambari-agent logs:
{'msg':'Unable to read structured output from /var/lib/ambari-agent/data/structured-out-status.json'}
2) To resolve this, first stop the ambari-agent.
3) Move /var/lib/ambari-agent/data/structured-out-status.json to /tmp.
4) Restart the ambari-agent.
5) Everything is green again.
I followed a post mentioned here: https://community.hortonworks.com/questions/16953/unable-to-start-hdp-serrvices-from-ambari.html
I don't know the exact reason behind this, but I was able to fix the issue.
Thanks again for your help.
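The steps above can be sketched as a small shell helper. This is only an illustration under assumptions: the function name quarantine_status_file and the timestamp suffix are my own additions, while the file path and the /tmp destination come from the steps above.

```shell
# Hypothetical helper: move structured-out-status.json out of the agent's
# data directory so the agent no longer tries to read it.
# Defaults follow the paths in this thread; both can be overridden.
quarantine_status_file() {
    data_dir="${1:-/var/lib/ambari-agent/data}"
    backup_dir="${2:-/tmp}"
    src="$data_dir/structured-out-status.json"
    if [ ! -e "$src" ]; then
        echo "nothing to move: $src not found"
        return 0
    fi
    # The timestamp suffix (an addition, not in the original steps) avoids
    # overwriting an earlier copy in the backup directory.
    mv "$src" "$backup_dir/structured-out-status.json.$(date +%Y%m%d%H%M%S)"
}

# Intended order on the affected host, as root:
#   ambari-agent stop
#   quarantine_status_file
#   ambari-agent start
```

Moving the file rather than deleting it keeps a copy around in case someone wants to inspect why it could not be read.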
Created 09-09-2016 11:52 AM
I've only ever seen this happen when the ambari agent process terminates on a host. In 3) you mention stopping and starting the agent. What's the status of the agent when Ambari is showing the wrong state? What does the ambari agent log file (/var/log/ambari-agent/ambari-agent.log) say?
[root@sandbox ~]# ambari-agent status
Found ambari-agent PID: 2877
ambari-agent running.
Agent PID at: /var/run/ambari-agent/ambari-agent.pid
Agent out at: /var/log/ambari-agent/ambari-agent.out
Agent log at: /var/log/ambari-agent/ambari-agent.log
Created 09-09-2016 01:04 PM
Also, this is a good related article:
Created 09-13-2016 03:47 PM
Can you please post a few lines from "ambari-server.log" and "ambari-agent.log" (from the host where the NN is running) that are at ERROR or WARN level?
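One way to pull those lines out is a simple grep. This is a rough sketch: the function name recent_problems is hypothetical, and the pattern just matches the strings ERROR and WARN anywhere in a line (so it also catches WARNING), which may include some noise.

```shell
# Hypothetical helper: print the most recent ERROR/WARN lines from a log file.
# $1 = log file, $2 = how many matching lines to show (default 20).
recent_problems() {
    log_file="$1"
    count="${2:-20}"
    grep -E 'ERROR|WARN' "$log_file" | tail -n "$count"
}
```

For example, recent_problems /var/log/ambari-agent/ambari-agent.log 50 would show the last 50 matching lines from the agent log.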
Also, running the ambari-agent with DEBUG logging will help in this case. This can be done by editing the file "/etc/ambari-agent/conf/ambari-agent.ini" and changing loglevel from INFO to DEBUG, as follows:
loglevel = DEBUG
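If you prefer not to edit the file by hand, the same change could be scripted. This is a sketch under assumptions: the function name set_agent_loglevel is hypothetical, and it assumes the file contains an uncommented line starting with loglevel, which it rewrites in place.

```shell
# Hypothetical helper: set the loglevel key in ambari-agent.ini.
# $1 = level (e.g. DEBUG or INFO), $2 = ini path (defaults to the path above).
set_agent_loglevel() {
    level="$1"
    ini="${2:-/etc/ambari-agent/conf/ambari-agent.ini}"
    tmp="$(mktemp)" || return 1
    # Only lines that start with "loglevel" are rewritten; commented lines
    # such as ";loglevel=..." are left untouched.
    if sed "s/^loglevel[[:space:]]*=.*/loglevel=$level/" "$ini" > "$tmp"; then
        # cat into the original file keeps its ownership and permissions.
        cat "$tmp" > "$ini"
        status=$?
    else
        status=1
    fi
    rm -f "$tmp"
    return "$status"
}
```

Calling set_agent_loglevel INFO afterwards would switch it back once the logs have been collected.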
Created 09-13-2016 05:31 PM
This most definitely looks like an agent issue.
Can you check whether:
1. There are stale agent processes
2. The agent is up and running (and not shutting down after starting for some reason)
To confirm both, you can use:
ps -ef | grep "ambari_agent"