Created 09-09-2016 10:18 AM
Hi Community,
We have seen unusual behavior on our Hortonworks cluster: on one of the hosts, Ambari shows all services as down even though they are actually running properly. This persists even after starting the services from Ambari.
We have checked the following:
1) Verified the status of the services from the logs, e.g. /var/hadoop/hdfs/hadoop-namenode .... (shows running).
2) Checked the PID of the services in /var/run/<servicename>/<servicename.pid> (the new PID was present).
3) Stopped and started the Ambari agents and the Ambari server, but it didn't help.
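For anyone repeating check 2, here is a minimal sketch of how the PID file comparison could be scripted. The function name check_pidfile is made up for illustration, and the PID file path is whatever your service actually uses. It relies on kill -0, which only tests for the process and sends no signal, so it should be run as root (kill -0 fails for processes owned by other users).

```shell
#!/bin/sh
# Hypothetical helper: report whether the PID stored in a PID file
# belongs to a live process. Prints one of: missing, invalid, running, stale.
check_pidfile() {
    pidfile="$1"
    if [ ! -r "$pidfile" ]; then
        echo "missing"
        return 2
    fi
    # Strip whitespace and newlines around the number
    pid=$(tr -d '[:space:]' < "$pidfile")
    case "$pid" in
        ''|*[!0-9]*) echo "invalid"; return 2 ;;
    esac
    # kill -0 delivers no signal; it only checks that the process exists
    if kill -0 "$pid" 2>/dev/null; then
        echo "running"
        return 0
    fi
    echo "stale"
    return 1
}
```

Call it with the PID file of the service in question, e.g. check_pidfile /var/run/ambari-agent/ambari-agent.pid. A result of "stale" means the file points at a process that no longer exists.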
Is there any way to fix this issue? If you need more information, please do let me know.
HDP Version: 2.3.4
Ambari Version: 2.1.1
Thanks in Advance.
Cheers !
Hammad
Created 09-09-2016 11:52 AM
I've only ever seen this happen when the ambari agent process terminates on a host. In 3) you mention stopping and starting the agent. What's the status of the agent when Ambari is showing the wrong state? What does the ambari agent log file (/var/log/ambari-agent/ambari-agent.log) say?
[root@sandbox ~]# ambari-agent status
Found ambari-agent PID: 2877
ambari-agent running.
Agent PID at: /var/run/ambari-agent/ambari-agent.pid
Agent out at: /var/log/ambari-agent/ambari-agent.out
Agent log at: /var/log/ambari-agent/ambari-agent.log
Created 09-09-2016 01:04 PM
Also, this is a good related article:
Created 09-13-2016 03:47 PM
Can you please post a few lines from "ambari-server.log" and "ambari-agent.log" (from the host where the NN is running), specifically the lines logged at ERROR or WARN level?
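A rough way to pull out just those lines is sketched below. The helper name recent_problems is made up for illustration. It simply greps for the words ERROR and WARN (so WARNING lines match too) and keeps the last few hits.

```shell
#!/bin/sh
# Hypothetical helper: show the most recent ERROR/WARN lines of a log file.
# $1 = log file, $2 = number of lines to show (defaults to 20)
recent_problems() {
    logfile="$1"
    count="${2:-20}"
    grep -E 'ERROR|WARN' "$logfile" | tail -n "$count"
}
```

For the agent that would be recent_problems /var/log/ambari-agent/ambari-agent.log. The server log is usually at /var/log/ambari-server/ambari-server.log, but that path is an assumption here, so check where your install writes it.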
Running the ambari-agent with DEBUG logging will also help in this case. To enable it, edit "/etc/ambari-agent/conf/ambari-agent.ini" and change loglevel from INFO to DEBUG, as follows:
loglevel = DEBUG
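If you would rather script the change than edit the file by hand, something like the following should work. This is only a sketch: the function name set_agent_loglevel is made up, and it assumes the ini file has a single line starting with "loglevel". It keeps a backup copy next to the original with a .bak suffix.

```shell
#!/bin/sh
# Hypothetical helper: rewrite the loglevel line of an agent ini file.
# $1 = path to the ini file, $2 = new level (e.g. DEBUG or INFO)
set_agent_loglevel() {
    ini="$1"
    level="$2"
    # Keep a backup so the original setting can be restored
    cp -p "$ini" "$ini.bak" || return 1
    # Replace the whole "loglevel = ..." line, tolerating spaces around "="
    sed "s/^loglevel[[:space:]]*=.*/loglevel=$level/" "$ini.bak" > "$ini"
}
```

Usage would be set_agent_loglevel /etc/ambari-agent/conf/ambari-agent.ini DEBUG, followed by ambari-agent restart, since the agent most likely only reads the setting at startup. Remember to set it back to INFO afterwards, as DEBUG output grows quickly.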
Created 09-13-2016 05:31 PM
This most definitely looks like an agent issue.
Can you check if
1. There are stale agent processes
2. The agent is up and running (and not shutting down shortly after starting for some reason)
To confirm both, you can use:
ps -ef | grep "ambari_agent"
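The grep above also matches its own process, which can be confusing when you are counting agents. Here is a small sketch that filters ps -ef output down to the agent processes and prints PID, parent PID and start time. The function name list_agent_procs is made up, and it assumes the usual Linux ps -ef column layout (UID PID PPID C STIME TTY TIME CMD).

```shell
#!/bin/sh
# Hypothetical helper: read `ps -ef` output on stdin and print
# "PID PPID STIME" for every ambari_agent process, skipping grep itself.
list_agent_procs() {
    awk 'index($0, "ambari_agent") && $8 != "grep" { print $2, $3, $5 }'
}
```

Run it as ps -ef | list_agent_procs. More than one line is not automatically a problem, since an agent may show up as a parent and a child process (an assumption worth verifying on your version). Entries with clearly different start times, or ones that survive an ambari-agent stop, are the ones to look at as possibly stale.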