Problem during installation receiving heartbeat from agent

Explorer

Hello all,
I ran into an issue when installing the CDH cluster. I'm trying to install on a single machine. I selected the machine and parcels, and I'm running the installation as root. At the end of the installation I receive the following error messages:
"Installation failed. Failed to receive heartbeat from agent.
Ensure that the host's hostname is configured properly.
Ensure that port 7182 is accessible on the Cloudera Manager server (check firewall rules)."

I've found this error in some other forums and attempted to make the changes suggested in those places.

These are the contents of my hosts file (I've also tried it with the localhost line commented out).
127.0.0.1 localhost
192.168.xx.xx FQDN
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters

This is my netstat output for the port; I assume it's just Cloudera Manager listening. I did specifically open the port on the firewall.
netstat -ntlp | grep :7182
tcp 0 0 0.0.0.0:7182 0.0.0.0:* LISTEN 1861/java
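
The netstat output above only shows that something is listening on the port; it does not show that a client can actually connect through the firewall. A minimal sketch of a connection test, assuming Python 3 is available on the agent host (the function name port_is_reachable is hypothetical, and the host name to test against is whatever the agent uses to reach the Cloudera Manager server):

```python
import socket


def port_is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        # create_connection resolves the host name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


# Example call (host name is an assumption for a single-machine install):
# port_is_reachable("localhost", 7182)
```

A False result here would point at the firewall or the server, while a True result suggests the problem lies elsewhere, for example in hostname resolution.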

Here are the contents of cloudera-scm-agent.out:
[29/Aug/2013 10:24:31 +0000] 2867 MainThread agent INFO SCM Agent Version: 4.6.3
[29/Aug/2013 10:24:31 +0000] 2867 MainThread agent ERROR Could not determine hostname or ip address; proceeding.
Traceback (most recent call last):
File "/usr/lib/cmf/agent/src/cmf/agent.py", line 1573, in parse_arguments
ip_address = socket.gethostbyname(fqdn)
gaierror: [Errno -5] No address associated with hostname
usage: agent.py [-h] [--agent_dir AGENT_DIR]
[--agent_httpd_port AGENT_HTTPD_PORT] --package_dir
PACKAGE_DIR [--parcel_dir PARCEL_DIR]
[--supervisord_path SUPERVISORD_PATH]
[--supervisord_httpd_port SUPERVISORD_HTTPD_PORT]
[--standalone STANDALONE] [--master MASTER]
[--environment ENVIRONMENT] [--host_id HOST_ID]
[--disable_supervisord_events] --hostname HOSTNAME
--ip_address IP_ADDRESS [--use_tls]
[--client_key_file CLIENT_KEY_FILE]
[--client_cert_file CLIENT_CERT_FILE]
[--verify_cert_file VERIFY_CERT_FILE]
[--client_keypw_file CLIENT_KEYPW_FILE] [--logfile LOGFILE]
[--logdir LOGDIR] [--optional_token] [--clear_agent_dir]
agent.py: error: argument --hostname is required
[29/Aug/2013 10:24:31 +0000] 2867 Dummy-1 agent INFO Stopping agent...

I also have the cloudera-scm-agent.log if that would be helpful. The only errors that I see in there are:
[28/Aug/2013 11:58:13 +0000] 25640 MainThread agent ERROR Failed to connect to newly launched supervisor. Agent will exit
.
.
.
[28/Aug/2013 11:52:44 +0000] 24811 MainThread agent INFO Trying to connect to newly launched supervisor (Attempt 4)
[28/Aug/2013 11:52:44 +0000] 24811 MainThread agent ERROR Failed! trying again in 1 second(s)

Any help would be much appreciated.

Thank you,
Meredith

1 ACCEPTED SOLUTION

Super Collaborator

Hey Meredith,

The cloudera-scm-agent can't determine what this machine believes its fully qualified domain name (FQDN) is. The hostname -f command queries that, as does the Python one-liner. Let's resolve that:

Assuming this is a CentOS/RHEL machine, what do you have set in /etc/sysconfig/network for the fully qualified domain name? It would also help to add an entry to /etc/hosts in the format

IP     FQDN    shortname

Your entry has only the IP and FQDN. Just ensure the shortname comes last in any entry that has one.

If it's not CentOS/RHEL, set the hostname in whatever way your OS expects, then run those three commands again. Once they all return what's expected (the last one should return both the FQDN and the IP), we should be in a great place for you to try starting the agent again.
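
To make the "IP FQDN shortname" layout concrete, here is a minimal sketch, assuming Python 3, that parses hosts-file text and looks up the entry for a given shortname. The first name after the IP is treated as the canonical name and the rest as aliases. The function names and the 192.0.2.10 address are placeholders for illustration, not values from this thread.

```python
def parse_hosts_line(line):
    """Split one hosts-file line into (ip, canonical_name, aliases).

    Returns None for blank lines, comments, and lines without a name.
    """
    content = line.split("#", 1)[0].strip()  # drop trailing comments
    if not content:
        return None
    fields = content.split()
    if len(fields) < 2:
        return None
    return fields[0], fields[1], fields[2:]


def find_entry(hosts_text, shortname):
    """Return the first entry that lists shortname as a name or alias, else None."""
    for line in hosts_text.splitlines():
        entry = parse_hosts_line(line)
        if entry is None:
            continue
        ip, canonical, aliases = entry
        if shortname == canonical or shortname in aliases:
            return entry
    return None


# Placeholder hosts content in the recommended "IP FQDN shortname" order.
sample = "127.0.0.1 localhost\n192.0.2.10 Accumulo3.local Accumulo3\n"
print(find_entry(sample, "Accumulo3"))
```

If find_entry returns None for the machine's shortname, the hosts file has no line that maps that shortname to an FQDN, which matches the situation described above.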

7 REPLIES

Super Collaborator
Hi Meredith,

Please run these commands directly on the command line of the node in question and paste the output back here so we can compare it to your current hosts file:

$ hostname
$ hostname -f
$ python -c 'import socket; print socket.getfqdn(), socket.gethostbyname(socket.getfqdn())'
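
The one-liner above uses the Python 2 print statement, so it fails with a syntax error under Python 3. A rough Python 3 equivalent, offered as a sketch (the function name check_hostname_resolution is hypothetical), which also reports a resolution failure instead of raising a traceback:

```python
import socket


def check_hostname_resolution():
    """Return (fqdn, ip); ip is None when the FQDN does not resolve."""
    fqdn = socket.getfqdn()
    try:
        ip = socket.gethostbyname(fqdn)
    except socket.gaierror:
        # Same error class that appears in the agent traceback.
        ip = None
    return fqdn, ip


fqdn, ip = check_hostname_resolution()
if ip is None:
    print(f"{fqdn}: does not resolve; check the hosts file")
else:
    print(f"{fqdn} -> {ip}")
```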

Thanks,
--

Explorer

root@Accumulo3:/var/log/cloudera-scm-agent# hostname
Accumulo3
root@Accumulo3:/var/log/cloudera-scm-agent# hostname -f
hostname: Name or service not known
root@Accumulo3:/var/log/cloudera-scm-agent# python -c 'import socket; print socket.getfqdn(), socket.gethostbyname(socket.getfqdn())'
Accumulo3
Traceback (most recent call last):
File "<string>", line 1, in <module>
socket.gaierror: [Errno -5] No address associated with hostname

Explorer

Accumulo3.local is what I currently have in the hosts file as the FQDN since it's not a networked machine.  I have tried it with just Accumulo3 as well.

Explorer

Thank you for all of your help! I'm on Ubuntu, so there is no /etc/sysconfig/network file. However, I did add "IP     FQDN    shortname" to the hosts file, which seemed to solve the hostname -f issue. I then ran the third command successfully. When I went back to the GUI to set up the hosts, it said that the host already existed on the machine, so I just added the parcels to that machine. I am now at the point where everything appears to be installed and I'm configuring things and fixing the health issues. Thank you so much for your help!

-Meredith

Contributor

Excellent. After searching for an entire day, this is the correct solution.

Super Collaborator
Glad you found this thread, then!