Created 08-14-2018 03:18 PM
I'm having issues installing the cloudera-scm-agent on the same machine as the manager. All of the remote hosts installed fine, just not the agent on the manager itself.
The wizard gets the packages installed, but then the agent can't seem to talk to the manager.
Wizard shows: Installation failed. Failed to receive heartbeat from agent.
and in the log in /var/log/cloudera-scm-agent/cloudera-scm-agent.log I see:
[14/Aug/2018 17:07:07 +0000] 18986 MainThread agent ERROR Failed to connect to previous supervisor.
Traceback (most recent call last):
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.15.0-py2.7.egg/cmf/agent.py", line 2137, in find_or_start_supervisor
self.get_supervisor_process_info()
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.15.0-py2.7.egg/cmf/agent.py", line 2281, in get_supervisor_process_info
self.identifier = self.supervisor_client.supervisor.getIdentification()
File "/usr/lib64/python2.7/xmlrpclib.py", line 1233, in __call__
return self.__send(self.__name, args)
File "/usr/lib64/python2.7/xmlrpclib.py", line 1591, in __request
verbose=self.__verbose
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/supervisor-3.0-py2.7.egg/supervisor/xmlrpc.py", line 470, in request
'' )
ProtocolError: <ProtocolError for 127.0.0.1/RPC2: 401 Unauthorized>
[14/Aug/2018 17:07:07 +0000] 18986 MainThread tmpfs INFO Reusing mounted tmpfs at /run/cloudera-scm-agent/process
[14/Aug/2018 17:07:08 +0000] 18986 MainThread agent INFO Trying to connect to newly launched supervisor (Attempt 1)
[14/Aug/2018 17:07:08 +0000] 18986 MainThread agent ERROR Failed! trying again in 1 second(s)
Any suggestions?
Created 08-24-2018 06:10 PM
The exception means that the agent cannot log in to the supervisor: the "401 Unauthorized" response shows that the supervisor rejected the agent's credentials. For some reason the agent's password does not match the one the running supervisor expects. I wonder if this host already had an agent running when you tried to install, or something like that.
To correct this, check for running supervisor processes, for example with "ps aux | grep supervisor | grep agent", and kill any that you find.
When no supervisor processes are running, try starting the agent again and see whether it starts and connects to a new supervisor.
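If you would rather script the check, here is a minimal Python sketch of the same filter as the "ps aux | grep supervisor | grep agent" pipeline. The function names are made up for illustration, and the match strings are just the two words from the command above; the exact command line of the supervisor process is not shown in this thread, so review the PIDs it prints before killing anything.

```python
import subprocess


def find_supervisor_pids(ps_output):
    """Return PIDs from `ps aux` text whose line mentions both
    'supervisor' and 'agent' (the same filter as the grep pipeline)."""
    pids = []
    for line in ps_output.splitlines():
        if "supervisor" in line and "agent" in line:
            fields = line.split()
            # In `ps aux` output the second column is the PID.
            if len(fields) > 1 and fields[1].isdigit():
                pids.append(int(fields[1]))
    return pids


def list_supervisor_pids():
    """Run `ps aux` on this host and return the matching PIDs."""
    output = subprocess.check_output(["ps", "aux"]).decode("utf-8", "replace")
    return find_supervisor_pids(output)
```

Calling list_supervisor_pids() on the manager host returns the candidate PIDs. The sketch only lists them and deliberately does not kill anything, so you can confirm each one really is a leftover supervisor first.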