
all services on ambari agent host are reachable but heartbeat is not available

Contributor

Hi,

I have a weird issue. One of my Ambari agents is not reporting a heartbeat. The agent service is up and running, and I can access all services running on that host. This happened after /var and / were 100% full. I have cleared the space and restarted both the agent and the server, but there is still no heartbeat.

19 Jun 2017 10:58:21,428 ERROR [ambari-client-thread-28] MetricsRequestHelper:114 - Error getting timeline metrics : No route to host
19 Jun 2017 10:58:36,743 ERROR [ambari-client-thread-161] MetricsRequestHelper:114 - Error getting timeline metrics : No route to host
19 Jun 2017 10:58:49,615 WARN [qtp-ambari-agent-171] HeartBeatHandler:235 - Host is in HEARTBEAT_LOST state - sending register command
19 Jun 2017 10:58:52,055 ERROR [ambari-client-thread-150] MetricsRequestHelper:114 - Error getting timeline metrics : No route to host
19 Jun 2017 10:58:52,334 INFO [qtp-ambari-agent-171] HeartBeatHandler:425 - agentOsType = centos6
19 Jun 2017 10:58:52,435 INFO [qtp-ambari-agent-171] HostImpl:294 - Received host registration, host=[hostname=host1,fqdn=host1.hdp.com,domain=hdp.com,architecture=x86_64,processorcount=4,physicalprocessorcount=4,osname=centos,osversion=6.9,osfamily=redhat,memory=16218212,uptime_hours=289,mounts=(available=6437488,mountpoint=/,used=42416140,percent=87%,size=51475068,device=/dev/sda5,type=ext4)(available=8104056,mountpoint=/dev/shm,used=5048,percent=1%,size=8109104,device=tmpfs,type=tmpfs)(available=95935916,mountpoint=/home,used=1902452,percent=2%,size=103081248,device=/dev/sda2,type=ext4)(available=16679096,mountpoint=/opt,used=32174532,percent=66%,size=51475068,device=/dev/sda6,type=ext4)(available=25360848,mountpoint=/var,used=72477520,percent=75%,size=103081248,device=/dev/sda3,type=ext4)] , registrationTime=1497850132334, agentVersion=2.4.1.0
19 Jun 2017 10:58:52,435 INFO [qtp-ambari-agent-171] TopologyManager:408 - TopologyManager.onHostRegistered: Entering
19 Jun 2017 10:58:52,436 INFO [qtp-ambari-agent-171] TopologyManager:410 - TopologyManager.onHostRegistered: host = host1.hdp.com is already associated with the cluster or is currently being processed
19 Jun 2017 10:58:52,485 INFO [qtp-ambari-agent-171] HeartBeatHandler:504 - Recovery configuration set to RecoveryConfig{, type=AUTO_START, maxCount=6, windowInMinutes=60, retryGap=5, maxLifetimeCount=1024, components=, recoveryTimestamp=1497850132483}
19 Jun 2017 10:59:07,423 ERROR [ambari-client-thread-150] MetricsRequestHelper:114 - Error getting timeline metrics : No route to host
19 Jun 2017 10:59:27,750 ERROR [ambari-client-thread-28] MetricsRequestHelper:114 - Error getting timeline metrics : connect timed out
19 Jun 2017 10:59:27,750 ERROR [ambari-client-thread-28] MetricsRequestHelper:121 - Error getting timeline metrics : connect timed out Cannot connect to collector: SocketTimeoutException.


Re: all services on ambari agent host are reachable but heartbeat is not available

Super Mentor

@Rajesh Reddy

- Have you already tried restarting ambari-agent?

- Did you notice any ERROR/WARNING message in the following agent log files?

/var/log/ambari-agent/ambari-agent.out 
/var/log/ambari-agent/ambari-agent.log 
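
If it helps, here is a rough sketch for pulling recent ERROR/WARN lines out of the two log files above. The function name `recent_agent_issues` and the 200-line window are my own assumptions for illustration, not something Ambari ships.

```shell
# Hypothetical helper: print ERROR/WARN lines from the tail of a log file.
# $1 = log file path, $2 = number of trailing lines to inspect (assumed default: 200)
recent_agent_issues() {
    log_file="$1"
    window="${2:-200}"
    [ -r "$log_file" ] || { echo "cannot read $log_file" >&2; return 2; }
    # 'WARN' also matches lines tagged WARNING
    tail -n "$window" "$log_file" | grep -E 'ERROR|WARN'
}

for f in /var/log/ambari-agent/ambari-agent.log /var/log/ambari-agent/ambari-agent.out; do
    echo "== $f"
    recent_agent_issues "$f" || echo "(nothing matched or file unreadable)"
done
```

An empty result only means nothing matched in the inspected window, so it may be worth widening the window to cover the time the disks filled up.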


From the Ambari agent host, are you able to reach the following ports on the Ambari server?

# telnet  $AMBARI_HOSTNAME   8440
# telnet  $AMBARI_HOSTNAME   8441
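
Since telnet is interactive, a non-interactive version of the same check might look like the sketch below. It assumes bash (for its /dev/tcp redirection) and the coreutils `timeout` command are available; `check_port` is a hypothetical name, and the 5-second limit is arbitrary.

```shell
# Hypothetical helper: succeed if a TCP connection to host:port opens within 5 seconds.
# Assumes bash's /dev/tcp pseudo-device and coreutils `timeout`.
check_port() {
    timeout 5 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# Only probe when AMBARI_HOSTNAME has been set by the caller.
if [ -n "${AMBARI_HOSTNAME:-}" ]; then
    for port in 8440 8441; do
        if check_port "$AMBARI_HOSTNAME" "$port"; then
            echo "port $port reachable"
        else
            echo "port $port NOT reachable"
        fi
    done
fi
```

This only shows that the TCP connection opens; it says nothing about whether the agent can complete registration over it.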


In the Ambari agent configuration file "/etc/ambari-agent/conf/ambari-agent.ini", do you see the correct hostname of the Ambari server?

Example: (Please check that the following entry has the correct Ambari server FQDN)

[server]
hostname = erie1.example.com
url_port = 8440
secured_url_port = 8441
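
To read that value without opening the file by hand, something along these lines should work. It is only a sketch: `server_hostname` is a hypothetical helper, and it assumes the simple `key = value` layout shown in the example above.

```shell
# Hypothetical helper: print the `hostname` value from the [server] section of an ini file.
server_hostname() {
    awk -F'=' '
        /^\[/ { in_server = ($0 ~ /^\[server\]/) }      # track which section we are in
        in_server && $1 ~ /^[ \t]*hostname[ \t]*$/ {
            gsub(/^[ \t]+|[ \t]+$/, "", $2)              # trim spaces around the value
            print $2
            exit
        }
    ' "$1"
}

conf=/etc/ambari-agent/conf/ambari-agent.ini
if [ -r "$conf" ]; then
    echo "agent is configured to talk to: $(server_hostname "$conf")"
fi
```

The printed name should match the FQDN that the Ambari server host reports for itself.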


Does the agent host return the correct FQDN when you run the following command on it?

# hostname -f
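
As a quick sanity check on that output, the sketch below flags names that do not look fully qualified. `looks_like_fqdn` is a hypothetical helper and the test is only a heuristic (a dot in the name, not a localhost alias); it does not verify DNS.

```shell
# Hypothetical helper: succeed if the argument looks like a fully qualified name.
# Heuristic only: requires a dot and rejects localhost aliases; no DNS lookup is done.
looks_like_fqdn() {
    case "$1" in
        localhost|localhost.*) return 1 ;;
        *.*) return 0 ;;
        *) return 1 ;;
    esac
}

name="$(hostname -f 2>/dev/null || hostname 2>/dev/null || echo unknown)"
if looks_like_fqdn "$name"; then
    echo "FQDN looks fine: $name"
else
    echo "suspicious hostname: $name"
fi
```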


Are you able to reach the Ambari SSL ports from the agent machine, as follows?

# openssl s_client -connect $AMBARI_HOSTNAME:8440
# openssl s_client -connect $AMBARI_HOSTNAME:8441
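
The same check can be run without an interactive session, roughly as sketched below. `tls_cert_presented` is a hypothetical name, the 10-second limit is arbitrary, and the function only tests whether the server sent a certificate; it does not validate the certificate or any client-side keys.

```shell
# Hypothetical helper: succeed if the server at host:port presents a certificate.
# Closing stdin (</dev/null) makes s_client exit instead of waiting for input.
tls_cert_presented() {
    timeout 10 openssl s_client -connect "$1:$2" </dev/null 2>/dev/null \
        | grep -q 'BEGIN CERTIFICATE'
}

# Only probe when AMBARI_HOSTNAME has been set by the caller.
if [ -n "${AMBARI_HOSTNAME:-}" ]; then
    for port in 8440 8441; do
        if tls_cert_presented "$AMBARI_HOSTNAME" "$port"; then
            echo "port $port: server certificate received"
        else
            echo "port $port: no certificate received"
        fi
    done
fi
```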


Re: all services on ambari agent host are reachable but heartbeat is not available

Contributor

Hi @Jay SenSharma

No errors were observed. The only error was about nodemanager health (url not accessible).

Yes, I can telnet to the ambari server.

[root@host1 ~]# hostname -f

host1.hdp.com

/etc/ambari-agent/conf/ambari-agent.ini:

[server]
hostname = server.hdp.com
url_port = 8440
secured_url_port = 8441

[root@host1 ~]# openssl s_client -connect 192.168.1.9:8441

CONNECTED(00000003)

[root@host1 ~]# openssl s_client -connect 192.168.1.9:8440

CONNECTED(00000003)

Re: all services on ambari agent host are reachable but heartbeat is not available

New Contributor

I am also facing the same issue, did you manage to resolve it?

Re: all services on ambari agent host are reachable but heartbeat is not available

Community Manager

@KPG1 as this is an older post, you would have a better chance of receiving a resolution by starting a new thread. This will also be an opportunity to provide details specific to your environment that could aid others in assisting you with a more accurate answer to your question. You can link this thread as a reference in your new post.


Regards,

Vidya Sargur,
Community Manager

