Created on 12-08-2015 01:14 PM - edited 09-16-2022 02:51 AM
Hello,
I'm attempting to install a 3-node cluster using the Cloudera Manager automated install. The installation fails at the final step with the error "Installation failed. Failed to receive heartbeat from agent." Other solutions from this message board haven't worked for me.
Here is the error from the agent log:
[08/Dec/2015 16:06:48 +0000] 18303 MainThread agent ERROR Failed to connect to previous supervisor.
Traceback (most recent call last):
  File "/usr/lib64/cmf/agent/src/cmf/agent.py", line 1636, in find_or_start_supervisor
    self.get_supervisor_process_info()
  File "/usr/lib64/cmf/agent/src/cmf/agent.py", line 1845, in get_supervisor_process_info
    self.identifier = self.supervisor_client.supervisor.getIdentification()
  File "/usr/lib64/python2.6/xmlrpclib.py", line 1199, in __call__
    return self.__send(self.__name, args)
  File "/usr/lib64/python2.6/xmlrpclib.py", line 1489, in __request
    verbose=self.__verbose
  File "/usr/lib64/cmf/agent/build/env/lib/python2.6/site-packages/supervisor-3.0-py2.6.egg/supervisor/xmlrpc.py", line 460, in request
    self.connection.request('POST', handler, request_body, self.headers)
  File "/usr/lib64/python2.6/httplib.py", line 914, in request
    self._send_request(method, url, body, headers)
  File "/usr/lib64/python2.6/httplib.py", line 951, in _send_request
    self.endheaders()
  File "/usr/lib64/python2.6/httplib.py", line 908, in endheaders
    self._send_output()
  File "/usr/lib64/python2.6/httplib.py", line 780, in _send_output
    self.send(msg)
  File "/usr/lib64/python2.6/httplib.py", line 739, in send
    self.connect()
  File "/usr/lib64/python2.6/httplib.py", line 720, in connect
    self.timeout)
  File "/usr/lib64/python2.6/socket.py", line 567, in create_connection
    raise error, msg
error: [Errno 111] Connection refused
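For anyone reading this traceback later: the last line, [Errno 111] Connection refused, means nothing accepted the TCP connection the agent tried to open to its supervisor process. Below is a minimal Python 3 sketch for checking whether anything is listening on a given address. The helper name port_is_open is hypothetical, and the address 127.0.0.1:19001 in the usage comment is an assumption about where the agent's supervisor listens, not something the log above shows; substitute the value from your own setup.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        # socket.create_connection is the call at the bottom of the traceback.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused (Errno 111), timeouts, and unreachable hosts.
        return False


# Example (assumed supervisor address, adjust as needed):
#   port_is_open("127.0.0.1", 19001)
```

A False result only confirms that the listener is missing or unreachable; it does not say why the supervisor is not running.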
Here's my hostname info:
[root@Cloudera1 cloudera-scm-agent]# hostname
cloudera1.icgsolutions.com
[root@Cloudera1 cloudera-scm-agent]# hostname -f
cloudera1.icgsolutions.com
[root@Cloudera1 cloudera-scm-agent]# python -c 'import socket; print socket.getfqdn(), socket.gethostbyname(socket.getfqdn())'
cloudera1.icgsolutions.com 192.168.0.86
Here's my hosts file:
192.168.0.106 cloudera3.icgsolutions.com cloudera3 cloudera3.icgsolutions
192.168.0.86 cloudera1.icgsolutions.com cloudera1 cloudera1.icgsolutions
192.168.0.89 cloudera2.icgsolutions.com cloudera2 cloudera2.icgsolutions
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
127.0.0.1 localdomain localhost
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
Any help would be appreciated.
Created 12-08-2015 03:38 PM
Can you run "service cloudera-scm-server-db status" and, if it's not running, run "service cloudera-scm-server-db start"? Your error message might mean that the CM server is unable to talk to the database.
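The check-then-start step suggested above could be scripted roughly as follows. This is only a sketch in Python 3: the function name ensure_service_running and its run parameter are hypothetical, and it assumes the init script follows the usual convention of exiting with status 0 from "status" only when the service is running.

```python
import subprocess


def ensure_service_running(name, run=subprocess.call):
    """Start the named service if its status check reports it is not running.

    Returns True if a start was attempted, False if the service was already up.
    """
    # Assumption: "service <name> status" exits 0 only when the service is running.
    if run(["service", name, "status"]) == 0:
        return False
    run(["service", name, "start"])
    return True


# Example (needs root on the Cloudera Manager server host):
#   ensure_service_running("cloudera-scm-server-db")
```

The run parameter exists only so the logic can be exercised without touching a real service; by default it calls the same commands quoted in the reply.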
Created 12-08-2015 04:40 PM
Uninstalling and reinstalling everything solved the issue. My guess is that because my hosts files were incorrect the first time I tried to install, the reinstall through CM was still using some of the old information. Thanks for the help!
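Since the root cause here was traced to incorrect hosts files, a quick sanity check before reinstalling may save a round trip. The thread does not say what exactly was wrong, so the Python 3 sketch below covers only one kind of mistake: a hostname listed against more than one IPv4 address. The function name conflicting_names is hypothetical.

```python
import ipaddress
from collections import defaultdict


def conflicting_names(hosts_text):
    """Return {hostname: [addresses]} for names mapped to more than one IPv4 address."""
    seen = defaultdict(set)
    for line in hosts_text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop comments, split on whitespace
        if len(fields) < 2:
            continue  # blank line, or an address with no names
        try:
            addr = ipaddress.ip_address(fields[0])
        except ValueError:
            continue  # first field is not an IP address
        if addr.version != 4:
            continue  # localhost normally has both IPv4 and IPv6 entries
        for name in fields[1:]:
            seen[name.lower()].add(str(addr))
    return {name: sorted(addrs) for name, addrs in seen.items() if len(addrs) > 1}


# Example, run on each node:
#   with open("/etc/hosts") as f:
#       print(conflicting_names(f.read()))
```

An empty result means no name is mapped to two different IPv4 addresses; it does not prove the file is otherwise correct, so it is still worth comparing the output of hostname -f against the file by hand.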