Created on 11-19-2013 07:17 AM - last edited on 10-31-2017 09:12 AM by cjervis
Hi,
I am trying to install a development instance of Hadoop on a Microsoft Azure VM (a single-node cluster). I am running Ubuntu 12.04.3 LTS.
Everything goes well until the very last step of the installation, where I get the following:
Installation failed. Failed to receive heartbeat from agent.
I looked at the logs and saw the following errors:
>>[19/Nov/2013 15:00:55 +0000] 1922 MainThread agent INFO Re-using pre-existing directory: /run/cloudera-scm-agent/process
>>[19/Nov/2013 15:00:55 +0000] 1922 MainThread agent INFO Re-using pre-existing directory: /run/cloudera-scm-agent/supervisor
>>[19/Nov/2013 15:00:55 +0000] 1922 MainThread agent INFO Re-using pre-existing directory: /run/cloudera-scm-agent/supervisor/include
>>[19/Nov/2013 15:00:55 +0000] 1922 MainThread agent INFO Connecting to previous supervisor: agent-1304-1384872987.
>>[19/Nov/2013 15:00:55 +0000] 1922 MainThread _cplogging INFO [19/Nov/2013:15:00:55] ENGINE Bus STARTING
>>[19/Nov/2013 15:00:55 +0000] 1922 MainThread _cplogging INFO [19/Nov/2013:15:00:55] ENGINE Started monitor thread '_TimeoutMonitor'.
>>[19/Nov/2013 15:00:55 +0000] 1922 HTTPServer Thread-2 _cplogging ERROR [19/Nov/2013:15:00:55] ENGINE Error in HTTP server: shutting down
>>Traceback (most recent call last):
>> File "/usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/CherryPy-3.2.2-py2.7.egg/cherrypy/process/servers.py", line 187, in _start_http_thread
>> self.httpserver.start()
>> File "/usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/CherryPy-3.2.2-py2.7.egg/cherrypy/wsgiserver/wsgiserver2.py", line 1825, in start
>> raise socket.error(msg)
>>error: No socket could be created on ('NexusHadoopVM', 9000) -- [Errno 99] Cannot assign requested address
>>
>>[19/Nov/2013 15:00:55 +0000] 19
I checked whether anything was already using ports 9000 and 9001 via
lsof -i :9000
lsof -i :9001
as well as netstat, and both came back with nothing. In the Azure VM manager I opened ports 9001 and 9002 (private and public); I am not sure what else needs to be configured.
I am also using the public IP address when adding the node to the cluster.
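For reference, here is roughly what I ran, plus my guess at checking name resolution, since [Errno 99] usually means the hostname does not resolve to an address bound on this machine (the getent/ip lines are my own additions, not something the installer asked for):
sudo lsof -i :9000                        # nothing listening
sudo lsof -i :9001                        # nothing listening
sudo netstat -tlnp | grep -E ':900[01]'   # also empty
hostname -f                               # what the machine thinks it is called
getent hosts NexusHadoopVM                # what the name from the log resolves to
ip addr show                              # addresses actually bound to local interfaces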
Please help!!!
Created 03-09-2015 08:28 AM
Great job.
I try to keep the names as simple as possible so I can run thousands of scripts against them.
My hosts file looks like this:
127.0.0.1 localhost
Loopback interfaces differ across machines; on AWS or, for example, Linode, you would just use the internal loopback device. Fast and easily managed.
#Cloudera Machines
192.168.2.1 n1
192.168.2.2 n2
192.168.2.3 n3
" n4
" n5
" n6
" n7
and so on, which makes changes across machines easier.
For example:
for i in {1..300}; do ssh n$i date; done   # checks the date on every machine to make sure each one is in sync
Keeping it simple makes life easier.
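Along the same lines, a sketch for pushing one hosts file to every node (this assumes passwordless SSH, sudo rights on each node, and the nN naming scheme above; adjust the range to your cluster size):
for i in {1..7}; do
  scp /etc/hosts n$i:/tmp/hosts            # stage the file on the node
  ssh n$i 'sudo mv /tmp/hosts /etc/hosts'  # move it into place as root
done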
Created 08-28-2015 01:08 AM
Do you mean you need to use base.hadoopdomain twice in the /etc/hosts entry?
I got the same errors and changed my hosts file to match yours, but I still get the same error.
Mine looks like the following; please let me know if I did something wrong here.
127.0.0.1 localdomain localhost
54.186.89.67 ec2-54-186-89-67.us-west-2.compute.amazonaws.com slave3 ec2-54-186-89-67.us-west-2.compute.amazonaws.com
54.186.87.178 ec2-54-186-87-178.us-west-2.compute.amazonaws.com slave2 ec2-54-186-87-178.us-west-2.compute.amazonaws.com
thanks,
Robin
Created on 11-05-2016 11:02 AM - edited 11-05-2016 11:03 AM
Here's a step-by-step guide to troubleshooting this error:
http://www.yourtechchick.com/hadoop/failed-receive-heartbeat-agent-cloudera-hadoop/
Created 10-19-2017 02:24 AM
In /etc/hosts on all nodes, put:
ip_address FQDN short_name
10.10.1.230 name.domain.com name
The FQDN must come before the short name.
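A quick way to verify the entry took effect on each node (the expected output assumes the example entry above; the Python line approximates what the Cloudera agent, which runs Python 2, would see):
getent hosts name    # expected: 10.10.1.230  name.domain.com  name
hostname -f          # expected: name.domain.com
python -c 'import socket; print socket.getfqdn()'   # should also print the FQDN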