08-07-2017 11:39 AM
Hi, I am trying to install Cloudera Manager 5.12.0, but in the final stage of installing the agents I receive this error: << Failed to receive heartbeat from agent. >>
In the log file /var/log/cloudera-scm-agent/cloudera-scm-agent.log I found this error: << Failed to connect to previous supervisor. >>
I started debugging the cloudera-scm-agent code and found that it fails when it tries to find supervisord.conf in the /run/cloudera-scm-agent/supervisor/ directory.
In agent.py there is a function called configure_supervisor_clients(self). It fails on line 2, where it builds the path to supervisord.conf, because no such file exists at that path.
    def configure_supervisor_clients(self):
        """
        Configure the supervisor client. We could do the XMLRPC configuration
        manually, but why bother if there's helper code to do it for us!
        """
    1.  supervisor_options = supervisor.options.ClientOptions()
    2.  supervisor_options.realize(args=["-c", os.path.join(self.supervisor_dir, "supervisord.conf")])
    3.  self.supervisor_client = supervisor_options.getServerProxy()
    4.  self.supervisor_ctl = supervisor.supervisorctl.Controller(supervisor_options)
I know that the agent is responsible for creating that conf file, but it seems that it is not creating it at all.
Can anyone help me with this problem?
I do all my installations with the cloudera-manager-installer.bin script.
08-07-2017 05:48 PM
08-07-2017 07:43 PM
I have already checked the /etc/hosts file, the /etc/sysconfig/network file and the hostname; they are all OK.
It seems likely that the supervisor can't run when the agent fails to generate the supervisor conf file.
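For anyone who wants to double-check the hostname side programmatically rather than by eye, here is a minimal sketch. It is not part of the Cloudera agent. The function name check_hostname is made up for illustration, and treating a loopback address as a problem is an assumption based on commonly cited setup advice (the FQDN should map to the host's real IP in /etc/hosts).

```python
import socket


def check_hostname(getfqdn=socket.getfqdn, gethostbyname=socket.gethostbyname):
    """Return (fqdn, ip, problems) for the local host.

    The resolver functions are parameters only so the check can be
    exercised without touching real DNS; by default the standard
    socket functions are used.
    """
    problems = []
    fqdn = getfqdn()
    try:
        ip = gethostbyname(fqdn)
    except socket.error as exc:  # covers socket.gaierror on Python 2 and 3
        ip = None
        problems.append("cannot resolve %s: %s" % (fqdn, exc))
    # Assumption: a loopback mapping for the FQDN is treated as suspicious.
    if ip is not None and ip.startswith("127."):
        problems.append("%s resolves to loopback address %s" % (fqdn, ip))
    return fqdn, ip, problems
```

Calling check_hostname() with no arguments on the failing host and printing the result shows what the machine believes its own name and address are. An empty problems list only means this particular check passed, not that the network setup is correct.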
08-09-2017 03:30 AM - edited 08-09-2017 04:16 AM
Check whether /run/cloudera-scm-agent/supervisor/supervisord.conf, the supervisord .pid file and include/flood.conf exist on your failing host. If not, copy them from another host. Then start the supervisord daemon and retry the installation.
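The check suggested above can be sketched in a few lines of Python. This is only an illustration: the directory and file names are the ones mentioned in this thread, and the helper name missing_supervisor_files is hypothetical, not something from the agent code.

```python
import os

# Directory and file names as mentioned in this thread; adjust if your
# installation uses a different layout.
SUPERVISOR_DIR = "/run/cloudera-scm-agent/supervisor"
EXPECTED_FILES = [
    "supervisord.conf",
    "supervisord.pid",
    os.path.join("include", "flood.conf"),
]


def missing_supervisor_files(supervisor_dir=SUPERVISOR_DIR, expected=EXPECTED_FILES):
    """Return the expected files (relative paths) that do not exist."""
    return [
        rel for rel in expected
        if not os.path.isfile(os.path.join(supervisor_dir, rel))
    ]
```

Running missing_supervisor_files() on the failing host and on a working host makes it easy to compare the two. Note that a copied .pid file only names a process ID from the other host, so the presence of the file does not mean supervisord is actually running.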
08-09-2017 03:41 AM
08-09-2017 04:47 AM
I've copied supervisord.conf, supervisord.pid and flood.conf to the right directory, but it still doesn't work.
The error info is different now:
[09/Aug/2017 19:44:56 +0000] 8239 MainThread agent ERROR Failed to connect to previous supervisor.
Traceback (most recent call last):
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.12.0-py2.7.egg/cmf/agent.py", line 2110, in find_or_start_supervisor
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.12.0-py2.7.egg/cmf/agent.py", line 2254, in get_supervisor_process_info
self.identifier = self.supervisor_client.supervisor.getIdentification()
File "/usr/lib64/python2.7/xmlrpclib.py", line 1233, in __call__
return self.__send(self.__name, args)
File "/usr/lib64/python2.7/xmlrpclib.py", line 1587, in __request
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/supervisor-3.0-py2.7.egg/supervisor/xmlrpc.py", line 460, in request
self.connection.request('POST', handler, request_body, self.headers)
File "/usr/lib64/python2.7/httplib.py", line 1017, in request
self._send_request(method, url, body, headers)
File "/usr/lib64/python2.7/httplib.py", line 1051, in _send_request
File "/usr/lib64/python2.7/httplib.py", line 1013, in endheaders
File "/usr/lib64/python2.7/httplib.py", line 864, in _send_output
File "/usr/lib64/python2.7/httplib.py", line 826, in send
File "/usr/lib64/python2.7/httplib.py", line 807, in connect
File "/usr/lib64/python2.7/socket.py", line 571, in create_connection
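The traceback ends inside socket.create_connection, which suggests the agent could not open a TCP connection to the supervisor's XML-RPC endpoint at all. A minimal sketch for testing that independently of the agent follows. The helper name can_connect is made up, and the host and port must come from your own supervisord.conf (the [inet_http_server] section). 127.0.0.1:19001 is often reported for Cloudera agents, but treat that value as an assumption.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Try to open a TCP connection; return (ok, reason).

    reason is an empty string on success, otherwise the text of the
    socket error (for example a refused connection or a timeout).
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except socket.error as exc:  # includes timeouts and refused connections
        return False, str(exc)
    sock.close()
    return True, ""
```

For example, can_connect("127.0.0.1", 19001) on the failing host (with the port replaced by the one from your config) tells you whether anything is listening there. If it returns a refused connection, supervisord is most likely not running, which would fit the fact that the copied supervisord.pid refers to a process on a different host.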