
Lost HeartBeats Ambari


Re: Lost HeartBeats Ambari

@Arthur GREVIN

  1. ambari-agent stop
  2. Ensure that /var/run/ambari-agent/ambari-agent.pid doesn't exist
  3. ambari-agent start
  4. Check /var/run/ambari-agent/ambari-agent.pid and ensure that the PID it contains is the agent process that is actually running
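The four steps above can be sketched as a small shell script. The pid file path and the `ambari-agent stop` / `ambari-agent start` commands come from the steps themselves; the helper names `read_pid`, `pid_is_alive` and `restart_agent` are hypothetical, and the sketch assumes it is run as root on the affected node.

```shell
#!/usr/bin/env bash
# Pid file path taken from the steps above; override via the environment if needed.
PID_FILE="${PID_FILE:-/var/run/ambari-agent/ambari-agent.pid}"

# Print the pid stored in a pid file; fail if the file is missing or empty.
read_pid() {
    [ -s "$1" ] || return 1
    tr -d '[:space:]' < "$1"
}

# Succeed if a process with the given pid exists and can be signalled
# by the current user (run as root to cover processes of other users).
pid_is_alive() {
    kill -0 "$1" 2>/dev/null
}

# Hypothetical wrapper around steps 1 to 4.
restart_agent() {
    local pid
    ambari-agent stop
    if [ -e "$PID_FILE" ]; then
        echo "stale pid file still present: $PID_FILE" >&2
        return 1
    fi
    ambari-agent start
    pid="$(read_pid "$PID_FILE")" || { echo "no pid file after start" >&2; return 1; }
    pid_is_alive "$pid" || { echo "pid $pid is not running" >&2; return 1; }
    # Show the command line so it can be confirmed to be the agent.
    ps -o pid,args -p "$pid"
}
```

Sourcing the file only defines the functions; calling `restart_agent` performs the restart and prints the process that owns the recorded pid, which still has to be checked by eye.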

Does the ambari-agent log show that heartbeats are being generated and acknowledged? How many nodes are in the cluster? Or are all the components that lose their heartbeat on the same node?


Re: Lost HeartBeats Ambari

I followed those steps, but nothing changed.

My ambari-agent log shows:

WARNING 2016-03-16 10:18:00,938 NetUtil.py:105 - Server at https://dl-master:8440 is not reachable, sleeping for 10 seconds...

The cluster has 4 nodes, and only one node loses its heartbeat.


Re: Lost HeartBeats Ambari

I still have the same problem. Could this be the cause?

Connecting to https://dl-master:8440/ca
ERROR 2016-03-16 10:18:31,414 NetUtil.py:77 - [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:590)

I don't have any certificate set up for ambari-agent, but I didn't have any problem before.


Re: Lost HeartBeats Ambari

Super Collaborator

You're probably hitting this (fixed in Ambari 2.2): https://issues.apache.org/jira/browse/AMBARI-14149

Could it be that your version of Python changed on those hosts?
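One way to answer the Python question is to compare the interpreter version on the failing host with a healthy one. As background, CPython 2.7.9 turned on HTTPS certificate verification by default (PEP 476), which would be consistent with the CERTIFICATE_VERIFY_FAILED message quoted earlier in the thread. The sketch below is only a hint: the helper names are hypothetical, it assumes GNU `sort -V`, it assumes the agent runs under a Python 2 interpreter called `python`, and some distributions backport the behaviour to older versions.

```shell
#!/usr/bin/env bash
# Succeed if version $1 is greater than or equal to version $2.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Print the interpreter version, e.g. "2.7.6".
# Python 2 writes its version to stderr, hence the 2>&1.
python_version() {
    "${1:-python}" --version 2>&1 | awk '{print $2}'
}

# Report which side of the 2.7.9 threshold the interpreter is on.
check_python() {
    local ver
    ver="$(python_version "$@")"
    if version_ge "$ver" "2.7.9"; then
        echo "Python $ver is at or above 2.7.9 (certificate verification on by default)"
    else
        echo "Python $ver is below 2.7.9"
    fi
}
```

Running `check_python` on each node and comparing the output shows quickly whether the failing host has a different interpreter from the others.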


Re: Lost HeartBeats Ambari

Super Collaborator

You can also check /var/log/ambari-server/ambari-server.log to see whether any errors are being logged for that host's heartbeat. Sometimes, if heartbeat processing on the server encounters an error, it shows up as a lost heartbeat from the agent.
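A minimal sketch of that log check follows. The helper name `host_errors` is hypothetical, and it assumes that the relevant lines contain both the word ERROR and the agent's host name on the same line; stack-trace lines that follow an error would not be matched.

```shell
#!/usr/bin/env bash
# Print ERROR lines from a log file ($1) that mention the given host name ($2).
host_errors() {
    grep -F -- "$2" "$1" | grep -w 'ERROR'
}
```

For example, `host_errors /var/log/ambari-server/ambari-server.log agent-host | tail -n 20`, where `agent-host` is a placeholder for the node that loses its heartbeat.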


Re: Lost HeartBeats Ambari

@Arthur GREVIN

A firewall change could be the cause. Have you configured a firewall on these hosts? If so, please try disabling it and check again; it may be what is stopping the agent from reaching the server.
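Before disabling the firewall outright, a quick TCP probe from the agent node can show whether the server port is reachable at all. This is a sketch under assumptions: `port_open` is a hypothetical helper, it relies on bash's `/dev/tcp` pseudo-device and on the coreutils `timeout` command, and the host and port in the usage line are the ones from the agent log quoted in the thread.

```shell
#!/usr/bin/env bash
# Return 0 if a TCP connection to HOST ($1) PORT ($2) succeeds
# within WAIT seconds ($3, default 5); non-zero otherwise.
port_open() {
    local host="$1" port="$2" wait="${3:-5}"
    timeout "$wait" bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}
```

Usage: `port_open dl-master 8440 && echo reachable || echo blocked`. A probe that hangs until the timeout tends to point at packets being dropped, while an immediate failure suggests nothing is listening or the connection is being rejected.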

Re: Lost HeartBeats Ambari

Contributor

@Arthur GREVIN

Can you please check and confirm whether any recent changes were made in the /etc/hosts file?
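A small sketch for inspecting the hosts file: it prints every address that the file maps to a given name, so a missing, duplicated or unexpected entry stands out. The helper name `hosts_lookup` is hypothetical; `dl-master` in the usage line is the server name from the agent log quoted in the thread.

```shell
#!/usr/bin/env bash
# Print every IP address that a hosts-format file ($1) maps to a name ($2).
hosts_lookup() {
    awk -v name="$2" '
        { sub(/#.*/, "") }   # strip comments before matching
        { for (i = 2; i <= NF; i++) if ($i == name) print $1 }
    ' "$1"
}
```

Usage: `hosts_lookup /etc/hosts dl-master`. Comparing the result with `getent hosts dl-master` on each node shows whether all nodes resolve the server to the same address.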


Re: Lost HeartBeats Ambari

New Contributor

I have the same problem. Has this been resolved?

I installed HDP 2.4 on Ubuntu 14.04 as a single-node cluster, and I configured the cluster using the hostname rather than the IP address. Please see the screenshot below:

5807-bbymk.png

"Heartbeat lost" is the error shown for all the components.

Connection failed: [Errno 111] Connection refused to (hostname) :50095


Re: Lost HeartBeats Ambari

Sorry, I didn't see your answer sooner. I wasn't able to solve the issue and finally decided to reinstall Ambari, which removed the problem. I haven't faced this kind of issue since then.


Re: Lost HeartBeats Ambari

@Prabhu Varadharajan

Please provide the logs, and check whether anything is listening on port 50095:

netstat -tlpn | grep 50095

The output includes the process ID of whatever is listening on port 50095. Look that process up with:

ps -ef | grep process_id

and kill the process if it is not important.
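The two commands above can be chained with a small filter. This is a sketch: `pid_on_port` is a hypothetical helper, and it assumes the usual Linux `netstat -tlpn` column layout, where column 4 is the local address, column 6 is the state and column 7 is `PID/Program name`. Run it as root, otherwise the PID column may show `-`.

```shell
#!/usr/bin/env bash
# Read `netstat -tlpn` output on stdin and print the pid of the
# process listening on the given port ($1).
pid_on_port() {
    awk -v port="$1" '
        $6 == "LISTEN" && $4 ~ (":" port "$") {
            split($7, owner, "/")   # column 7 looks like "1234/java"
            print owner[1]
        }
    '
}
```

Usage: `netstat -tlpn | pid_on_port 50095`, then `ps -fp` with the printed pid to see what the process is. Empty output means nothing is listening on the port, which would match the "Connection refused" error.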
