heartbeat lost for all the services (HDP 2.4)

Expert Contributor

I have a four-node cluster (HDP 2.4). The Ambari dashboard was showing "heartbeat lost" for all the services on one of the nodes (Machine1). On investigation, I found that Machine1 had been shut down for some reason. Below are the steps I have tried. Please note that ambari-server is installed on Machine2.

1) Restarted the node, and restarted ambari-agent several times.

2) Restarted the Ambari Metrics Collector and ZooKeeper server manually on this node, but there was no change.

3) Restarted ambari-server, and ambari-agent on all the nodes.

What should I try next?

The following is from ambari-alerts.log. I don't understand why this error occurs, because I don't think I started Ambari Monitor or any other service manually before starting ambari-agent.

INFO 2016-04-29 14:36:51,936 logger.py:67 - Pid file /var/run/ambari-metrics-monitor/ambari-metrics-monitor.pid is empty or does not exist
ERROR 2016-04-29 14:36:51,937 script_alert.py:112 - [Alert][ams_metrics_monitor_process] Failed with result CRITICAL: ['Ambari Monitor is NOT running on Machine1']

8 REPLIES

Re: heartbeat lost for all the services (HDP 2.4)

Explorer

@Pradeep kumar

Do you have SmartSense running?

Re: heartbeat lost for all the services (HDP 2.4)

Expert Contributor

@Ranjana Soundararajan I didn't install SmartSense as one of the services. It asked for a Customer ID registered with Hortonworks, which I do not have.

Re: heartbeat lost for all the services (HDP 2.4)

@Ranjana Soundararajan, I think the original agent is in a strange state. Please try this:

  • Assuming Machine1 is Linux-based, run "ps -ef | grep ambari" on Machine1 and kill -9 the Ambari processes it lists.
  • yum remove ambari-agent
  • yum install ambari-agent
  • ambari-agent start
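
After a reinstall it may be worth confirming that the agent still points at the ambari-server host (Machine2 in this thread) before starting it, since a freshly installed package may not carry over the old setting. Below is a minimal sketch, assuming the default config location /etc/ambari-agent/conf/ambari-agent.ini and a [server] section with a hostname= key; the function name agent_server_host is just an illustration, not part of Ambari.

```shell
# Print the Ambari server host that the agent is configured to report to.
# Assumption: default ini layout with a [server] section holding hostname=...
agent_server_host() {
  local ini="${1:-/etc/ambari-agent/conf/ambari-agent.ini}"
  awk -F= '
    # Track whether we are inside the [server] section.
    /^\[/ { in_server = ($0 == "[server]"); next }
    # Inside [server], print the trimmed value of the hostname key and stop.
    in_server && $1 ~ /^[ \t]*hostname[ \t]*$/ {
      gsub(/^[ \t]+|[ \t]+$/, "", $2); print $2; exit
    }' "$ini"
}
```

Calling agent_server_host with no argument on Machine1 should print the ambari-server host. If it prints something else, edit the hostname= line and then run ambari-agent restart.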

Hope this helps,

Re: heartbeat lost for all the services (HDP 2.4)

Expert Contributor

@Ranjana Soundararajan I tried these steps several times and also rebooted the node, but the status of this node in the Ambari dashboard did not change. I then deleted the node from the cluster and tried to add it back. This time I started getting a hostname-related issue, for which I have opened another thread. Thanks for your input, Ranjana.

Re: heartbeat lost for all the services (HDP 2.4)

Only a restart of ambari-agent worked for me.

Thanks.

Re: heartbeat lost for all the services (HDP 2.4)

Hi @Pradeep kumar

Can you access the Ambari UI on port 8080? If you can, try stopping all services and then starting all services. Services depend on being started in a certain order, so using "stop all" and "start all" in Ambari could fix your problem.

Re: heartbeat lost for all the services (HDP 2.4)

Hello! It looks like the IP addresses of the agents have changed. Could you please verify the IP shown in the "Hosts" tab? Click any missing node and, under "Summary" at the bottom left, check that the IP known to Ambari Server is the same as the one the lost host currently has. If they are different, you should restore the IP that Ambari Server knows.
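
The same comparison can be scripted. This is only a sketch under some assumptions: that the Ambari REST API returns the host's IP in an "ip" field when asked for fields=Hosts/ip, and that admin:admin, Machine1 and Machine2 are placeholders for your own credentials and hosts. The helper names extract_host_ip and same_ip are made up for this example.

```shell
# Pull the "ip" value out of an Ambari host resource response read from stdin.
# Uses sed instead of a JSON parser so it runs on a bare node; assumes one "ip" field.
extract_host_ip() {
  sed -n 's/.*"ip"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Succeed only if both IPs are non-empty and identical.
same_ip() {
  [ -n "$1" ] && [ "$1" = "$2" ]
}

# Example use (placeholders, run from any node that can reach the server):
#   known=$(curl -s -u admin:admin \
#     "http://Machine2:8080/api/v1/hosts/Machine1?fields=Hosts/ip" | extract_host_ip)
#   current=$(ssh Machine1 hostname -i | awk '{print $1}')
#   same_ip "$known" "$current" || echo "IP mismatch: Ambari has $known, host has $current"
```

Note that hostname -i depends on how /etc/hosts is set up on the node, so treat its output as a hint and confirm with ip addr if the two values disagree.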

Re: heartbeat lost for all the services (HDP 2.4)

@Pradeep kumar,

Please check whether this file exists: /var/run/ambari-metrics-monitor/ambari-metrics-monitor.pid.

If the file is there, kill the process whose ID is stored in it, then delete the file.

Also check whether the process is still running; if it is, kill it as well.

Then do a clean restart of Ambari: first stop ambari-agent, then restart ambari-server, and then start ambari-agent again.

After completing all these steps, try restarting the service. This worked for me; I faced a similar issue some days ago.
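
The pid file cleanup above can be sketched as a pair of shell functions. This is a rough illustration rather than anything shipped with Ambari: the names pid_file_state and clean_pid_file are invented here, and it should be run as root so that kill -0 can see processes owned by other users.

```shell
# Report the state of a pid file: missing, empty, stale or running.
pid_file_state() {
  local f="$1" pid
  if [ ! -e "$f" ]; then echo missing; return; fi
  pid=$(tr -d '[:space:]' < "$f")
  if [ -z "$pid" ]; then echo empty; return; fi
  # kill -0 sends no signal; it only tests whether the process can be signalled.
  if kill -0 "$pid" 2>/dev/null; then echo running; else echo stale; fi
}

# Kill the recorded process if it is still alive, then remove the pid file.
clean_pid_file() {
  local f="$1"
  case "$(pid_file_state "$f")" in
    running) kill "$(tr -d '[:space:]' < "$f")" && rm -f "$f" ;;
    stale|empty) rm -f "$f" ;;
    missing) : ;;   # nothing to clean up
  esac
}
```

On Machine1 you would call clean_pid_file /var/run/ambari-metrics-monitor/ambari-metrics-monitor.pid, then run ambari-agent stop there, ambari-server restart on Machine2, and ambari-agent start on Machine1 again.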
