Installation failed. Failed to receive heartbeat from agent.

Explorer

I am trying to install Cloudera Manager Standard edition and CDH4 with parcels. This is being installed on three Dell machines running Ubuntu 12.04.2 LTS 64-bit. I am receiving the following error on all three machines:

    Ensure that the host's hostname is configured properly.

    Ensure that port 7182 is accessible on the Cloudera Manager server (check firewall rules).

    Ensure that ports 9000 and 9001 are free on the host being added.

    Check agent logs in /var/log/cloudera-scm-agent/ on the host being added (some of the logs can be found in the installation details).

 

I checked that the hostname is configured.

 

I checked that ports 7182, 9000 and 9001 are free. (I am guessing that Cloudera uses 9000 and 9001 for Python, because these ports are in use after the install fails but not before it.)

sudo netstat -tulpn
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:6000            0.0.0.0:*               LISTEN      1104/X
tcp        0      0 127.0.0.1:53            0.0.0.0:*               LISTEN      1123/dnsmasq
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      655/sshd
tcp        0      0 127.0.0.1:631           0.0.0.0:*               LISTEN      792/cupsd
tcp        0      0 127.0.0.1:6010          0.0.0.0:*               LISTEN      29403/0
tcp6       0      0 :::6000                 :::*                    LISTEN      1104/X
tcp6       0      0 :::22                   :::*                    LISTEN      655/sshd
tcp6       0      0 ::1:631                 :::*                    LISTEN      792/cupsd
tcp6       0      0 ::1:6010                :::*                    LISTEN      29403/0
udp        0      0 127.0.0.1:53            0.0.0.0:*                           1123/dnsmasq
udp        0      0 0.0.0.0:68              0.0.0.0:*                           1073/dhclient
udp        0      0 0.0.0.0:36051           0.0.0.0:*                           790/avahi-daemon: r
udp        0      0 0.0.0.0:5353            0.0.0.0:*                           790/avahi-daemon: r
udp6       0      0 :::50807                :::*                                790/avahi-daemon: r
udp6       0      0 :::5353                 :::*                                790/avahi-daemon: r
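
The listing above shows which ports are listening on the host being added. A complementary check is whether that host can actually open a TCP connection to port 7182 on the Cloudera Manager server. The following is a minimal sketch, assuming bash and the coreutils `timeout` command are available. The function name `port_reachable` and the host name `cm-server.example.com` are placeholders, not part of Cloudera Manager.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if a TCP connection to host:port opens within 3 seconds.
# Uses bash's built-in /dev/tcp redirection, so no extra network tools are required.
port_reachable() {
    local host="$1" port="$2"
    timeout 3 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null
}

# Usage from each host being added (replace the placeholder host name):
#   port_reachable cm-server.example.com 7182 && echo "reachable" || echo "not reachable"
```

A failure here would point at routing, name resolution, or a firewall between the hosts rather than at the local port usage shown by netstat.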

 

I also checked the firewalls, and they are open:

sudo iptables -L
Chain INPUT (policy ACCEPT)
target     prot opt source               destination

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination

 

Lastly, I checked the log files in /var/log/cloudera-scm-agent/. They do not show any errors, and there was only one warning: that the default socket timeout was set to 30.

 

Can anyone point me to what the possible problem is? We are looking at using Hadoop in one of our solutions and are trying to evaluate it before purchasing the Enterprise version. I cannot use a cloud version because of data restrictions put on me by the data vendor and client, so I need an internal sandbox to get an idea of what we need to develop and what we will need to support. Thanks!

 

2 ACCEPTED SOLUTIONS

Guru

@enelso:  Your hostname needs to be tied to an actual IP address on your local network which can send/receive traffic between all the hosts.  The address you have associated your hostname with is the loopback address, which cannot route actual network traffic off the host.

 

Use "ifconfig -a" to see a listing of your network interfaces and choose one that has an actual IP address. 
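
To see which address a hostname is currently tied to, the name can be resolved the same way other programs on the host resolve it. The following is a minimal sketch, assuming a Linux host with `getent` and `hostname -f`; the function names are hypothetical. On Ubuntu, a default /etc/hosts often maps the machine's own name to 127.0.1.1, which this check would flag.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if the address is loopback (127.0.0.0/8 or ::1).
is_loopback() {
    case "$1" in
        127.*|::1) return 0 ;;
        *)         return 1 ;;
    esac
}

# Hypothetical helper: resolve this host's fully qualified name and report
# whether it maps to a loopback address.
check_hostname_resolution() {
    local fqdn addr
    fqdn="$(hostname -f)" || return 2
    # getent consults /etc/hosts and DNS in the order set by /etc/nsswitch.conf
    addr="$(getent hosts "$fqdn" | awk '{ print $1; exit }')"
    if [ -z "$addr" ]; then
        echo "$fqdn does not resolve"
        return 2
    fi
    if is_loopback "$addr"; then
        echo "$fqdn resolves to loopback address $addr"
        return 1
    fi
    echo "$fqdn resolves to $addr"
}

# Usage on each host: check_hostname_resolution
```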


Guru

Hello,

 

  I apologize for the delay.  I think there may be a couple of things going on.  For starters, you should add your own hostname and IP address to the /etc/hosts file on each machine.  In other words, both Host1 and Host2 entries should be in the /etc/hosts file on both machines.  Also, have you checked to see if iptables is running?  That is a firewall app that can stop traffic between nodes.  To identify if iptables is running and disable it, do this (as root):

 

$ sudo chkconfig iptables --list
iptables 0:off 1:off 2:off 3:off 4:off 5:off 6:off

 

 

If you see iptables as "on" for any of those values (especially #5), then it's probably getting in your way, and you should disable it unless you've got a company policy requiring it to be enabled.  To disable it, run this command:

 

sudo chkconfig iptables off
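
The /etc/hosts advice in this reply can also be checked mechanically. The following is a minimal sketch, assuming the standard hosts-file layout (address first, then one or more names). The helper name `hosts_file_has` and the host names in the usage lines are illustrative only.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if NAME appears in the hosts-format FILE
# on a line whose address is not a loopback address.
hosts_file_has() {
    local file="$1" name="$2"
    awk -v name="$name" '
        /^[[:space:]]*#/ { next }               # skip comment lines
        {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break            # stop at a trailing comment
                if ($i == name && $1 !~ /^127\./ && $1 != "::1") found = 1
            }
        }
        END { exit (found ? 0 : 1) }
    ' "$file"
}

# Usage on every machine, once per cluster member (placeholder names):
#   hosts_file_has /etc/hosts host1.example.com || echo "host1 entry missing or loopback"
#   hosts_file_has /etc/hosts host2.example.com || echo "host2 entry missing or loopback"
```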

 


30 REPLIES

Explorer

I also have this issue: 'Failed to receive heartbeat from agent.' I am using the Cloudera Manager portal to add one more node to the cluster with package installation and to assign roles on this node. The issue occurred in the phase when the Cloudera Manager Agent was started.

 

What happened is:

1. The Cloudera Manager agent has started, and the cloudera-manager-server can detect the new host.

2. The heartbeat from cloudera-scm-agent is not received.

 

Can you explain in a bit more detail how the heartbeat works between Cloudera Manager Server and Cloudera Manager Agent?

How can cloudera-scm-server run commands like 'yum install' and 'yum list' on the new node? On what port does cloudera-scm-server receive the heartbeat from cloudera-scm-agent? What does cloudera-scm-agent do on port 9000? Who starts supervisord, and how?
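
Part of this question can be checked on the new node itself, because the agent takes the server address it heartbeats to from its configuration file. The following is a minimal sketch, assuming the default path /etc/cloudera-scm-agent/config.ini and the key names `server_host` and `server_port`. Both are assumptions here and should be verified against the installed version. The helper name `agent_config_value` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the value of KEY from an INI-style FILE.
# Commented-out lines (starting with #) are ignored; only the first match is printed.
agent_config_value() {
    local file="$1" key="$2"
    sed -n "s/^[[:space:]]*$key[[:space:]]*=[[:space:]]*//p" "$file" | head -n 1
}

# Usage on the new node (path and key names are assumptions, see above):
#   agent_config_value /etc/cloudera-scm-agent/config.ini server_host
#   agent_config_value /etc/cloudera-scm-agent/config.ini server_port
```

If the printed host is localhost or a name the node cannot resolve, the agent has no way to reach the server even when the agent process itself is running.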

 

 

 

 

 

 

Expert Contributor
Clint, my hosts file looks like this, and it does not seem to be working:
52.60.239.41 ip-172-31-12-19.ca-central-1.compute.internal
52.60.157.176 ip-172-31-3-218.ca-central-1.compute.internal
35.182.18.189 ip-172-31-3-179.ca-central-1.compute.internal
35.182.25.52 ip-172-31-2-137.ca-central-1.compute.internal
Do I have to add a short name at the end of each line? And that short name comes from my EC2 instance name, right?
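
One way to sanity-check entries like these: a default EC2 private DNS name of the form ip-A-B-C-D embeds the instance's private IPv4 address A.B.C.D, so each line can be compared against the address its name implies. The sketch below is illustrative and the function names are hypothetical. Whether such a mismatch is the cause of the problem in this post is not established in the thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper: derive the private IPv4 address embedded in an
# EC2-style private DNS name, e.g. ip-10-0-0-5.region.compute.internal.
ec2_name_to_private_ip() {
    local short="${1%%.*}"      # keep only the first DNS label
    short="${short#ip-}"        # drop the "ip-" prefix
    printf '%s\n' "$short" | tr '-' '.'
}

# Hypothetical helper: report hosts-file lines whose address differs from
# the address implied by the EC2-style name on the same line.
check_ec2_hosts_entries() {
    local file="$1" addr name rest expected status=0
    while read -r addr name rest; do
        case "$name" in
            ip-*.compute.internal|ip-*.ec2.internal)
                expected="$(ec2_name_to_private_ip "$name")"
                if [ "$addr" != "$expected" ]; then
                    echo "$name: listed as $addr, name implies $expected"
                    status=1
                fi
                ;;
        esac
    done < "$file"
    return "$status"
}

# Usage: check_ec2_hosts_entries /etc/hosts
```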

New Contributor

Hi Clint,

 

Could you please guide me through the commands? I'm really new to CentOS, and I'm facing the same issue while installing my cluster with Docker.

 

 

snapshot2.png

 

avatar

Hi,

 

I encountered the same issue installing the CM6 agent on all my hosts.

Same error:

>>[13/Dec/2018 11:11:08 +0000] 3831 MainThread agent ERROR Heartbeating to cl-cmu.cloudera.de:7182 failed. 
>>Traceback (most recent call last): 
>> File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/cmf/agent.py", line 1362, in _send_heartbeat 
>> self.cfg.max_cert_depth) 
>> File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/cmf/https.py", line 139, in __init__

 

I have checked my hosts file, and everything looks good.

A telnet from the host to the server also gets through...

It might be a bug in the installation.

avatar

 

Also, maybe there are some typos in the installation script?

avatar
Sorry, my bad.

Expert Contributor

Hi Dennis,

 

This thread is pretty old and marked solved. It might be a good idea to start a new thread instead of adding more to this one.

 

The stack trace you provided appears to be a partial one, though I could be wrong. The stack trace we do have appears to point to a problem with TLS. 

 

Have you configured TLS already in CM or on the Agent?

Are you attempting to use AutoTLS? If you are, have you properly set up all of the hosts?

---
Customer Operations Engineer | Security SME | Cloudera, Inc.

avatar

Hi lhebert,

 

I'm quite confused by the Auto-TLS mechanism in the Cloudera Manager setup.

Do we run it on all hosts or just on our CM host?

 

P.S. I have started a new post. Please close this thread. Thank you.

 

New Contributor

Can you elaborate on "hostname tied to an actual IP address" and "Use 'ifconfig -a' to see a listing of your network interfaces and choose one that has an actual IP address"?

 

How do I know which hostname and IP address to use? When installing a single-node Cloudera cluster, how would you know which hosts to specify?
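
To make the quoted advice concrete: the candidate addresses are the ones bound to non-loopback interfaces, and the hostname is whatever `hostname -f` prints on that machine. The following is a minimal sketch, assuming a Linux host with the iproute2 `ip` command (`ifconfig -a` shows the same information in a different layout). The helper name `non_loopback_addrs` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads `ip -o -4 addr show` output on stdin and prints
# "interface address" for every interface except the loopback interface lo.
non_loopback_addrs() {
    awk '$2 != "lo" { sub(/\/.*/, "", $4); print $2, $4 }'
}

# Usage:
#   hostname -f                                 # the name other hosts should use
#   ip -o -4 addr show | non_loopback_addrs     # addresses that can carry traffic
```

For a single-node install the same rule would apply: the one host to specify is this machine's own name, provided that name resolves to one of the addresses printed above rather than to a 127.x address.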

 

Thank you!