OK, I understand that to restore the Ambari heartbeat you restart ambari-server and ambari-agent so that they can reconnect with each other. The problem is that the Ambari server restart takes too long: it does restart, but it takes on the order of 600 seconds (10 minutes) as opposed to the default 50 seconds suggested in https://community.hortonworks.com/questions/183299/waiting-for-ambari-server-to-start-error-exiting-... which seems very unhealthy to me. What could be causing this? By the time the server has restarted, I suspect ambari-agent has lost its connection again, so I keep losing the heartbeat because the ambari-server and ambari-agent restarts are out of sync.
Can you please let us know how many nodes there are in your cluster?
Also, can you please share your ambari-server.log from the ambari-server startup, so that we can check at which point Ambari Server is taking a long time to come up?
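In the meantime, a rough way to see where the startup stalls is to look for the largest time gaps between consecutive log lines. The sketch below is only an illustration, not an Ambari tool: the function names are hypothetical, and it assumes each log line starts with a timestamp such as "10 Apr 2018 12:34:56,789" (adjust the pattern if your log4j layout differs) and that month names parse in an English locale.

```python
import re
from datetime import datetime

# Assumed timestamp layout at the start of each log line, e.g.
# "10 Apr 2018 12:34:56,789  INFO [main] ..." (adjust for your log4j pattern).
TS_RE = re.compile(r"^(\d{2} \w{3} \d{4} \d{2}:\d{2}:\d{2}),(\d{3})")
TS_FORMAT = "%d %b %Y %H:%M:%S"


def parse_timestamp(line):
    """Return the datetime at the start of a log line, or None if absent."""
    m = TS_RE.match(line)
    if m is None:
        return None
    base = datetime.strptime(m.group(1), TS_FORMAT)
    return base.replace(microsecond=int(m.group(2)) * 1000)


def largest_gaps(lines, top=5):
    """Return the `top` largest (seconds, preceding_line) gaps between
    consecutive timestamped lines, largest first."""
    gaps = []
    prev_ts = None
    prev_line = None
    for line in lines:
        ts = parse_timestamp(line)
        if ts is None:
            continue  # skip stack-trace or continuation lines without a timestamp
        if prev_ts is not None:
            gaps.append(((ts - prev_ts).total_seconds(), prev_line.rstrip("\n")))
        prev_ts, prev_line = ts, line
    gaps.sort(key=lambda g: g[0], reverse=True)
    return gaps[:top]
```

To use it, open the log (commonly /var/log/ambari-server/ambari-server.log, though your path may differ), pass the file object to largest_gaps, and print the result. The line reported before each large gap is the last thing the server logged before it went quiet, which is usually a good hint at the slow step.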
Sometimes this happens when the Ambari DB contains a very large number of entries (an old Ambari cluster usually accumulates a lot of old operational logs), which causes slowness.
So please try purging old Ambari DB entries using the utility mentioned here: https://docs.hortonworks.com/HDPDocuments/Ambari-188.8.131.52/bk_ambari-administration/content/purging-am...
Similarly, it sometimes happens because the Ambari Server heap setting is too low. So please refer to the mentioned tuning article to see whether your Ambari Server needs some tuning: