Failed to connect to previous supervisor

Explorer

I chose to install CDH through the automated installer in Cloudera Manager. The download completes, but the installation cannot proceed because of this error:

 

Installation failed. Failed to receive heartbeat from agent.
Ensure that the host's hostname is configured properly.
Ensure that port 7182 is accessible on the Cloudera Manager Server (check firewall rules).
Ensure that ports 9000 and 9001 are not in use on the host being added.
Check agent logs in /var/log/cloudera-scm-agent/ on the host being added. (Some of the logs can be found in the installation details).
If Use TLS Encryption for Agents is enabled in Cloudera Manager (Administration -> Settings -> Security), ensure that /etc/cloudera-scm-agent/config.ini has use_tls=1 on the host being added. Restart the corresponding agent and click the Retry link here.
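The port checks in the message above can be scripted. The following is a minimal Python 3 sketch, not part of Cloudera Manager: the helper names can_connect and port_in_use are made up for illustration, and the server name in the usage comment is an assumption taken from the /etc/hosts shown later in this thread. It is meant to be run on the host being added.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def port_in_use(port, host="0.0.0.0"):
    """Return True if binding host:port fails, which normally means
    another process already holds the port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return True
    return False


# Example usage (the server name is an assumption; substitute your own):
#   can_connect("node1.cirro.com", 7182)   # expected True if 7182 is reachable
#   port_in_use(9000), port_in_use(9001)   # expected False while the agent is stopped
```

If the agent is already running on the host, it may itself be holding ports 9000 and 9001, so stop it before interpreting the second check.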

However, when I checked the details, I saw "Failed to connect to previous supervisor" in the error output:

Installation script completed successfully.
all done
closing logging file descriptor
>>[18/Jul/2017 01:34:30 +0000] 4355 MainThread agent INFO Re-using pre-existing directory: /run/cloudera-scm-agent/supervisor
>>[18/Jul/2017 01:34:30 +0000] 4355 MainThread agent INFO Re-using pre-existing directory: /run/cloudera-scm-agent/flood
>>[18/Jul/2017 01:34:30 +0000] 4355 MainThread agent INFO Re-using pre-existing directory: /run/cloudera-scm-agent/supervisor/include
>>[18/Jul/2017 01:34:30 +0000] 4355 MainThread agent ERROR Failed to connect to previous supervisor.
>>Traceback (most recent call last):
>> File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.12.0-py2.7.egg/cmf/agent.py", line 2109, in find_or_start_supervisor
>> self.configure_supervisor_clients()
>> File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.12.0-py2.7.egg/cmf/agent.py", line 2290, in configure_supervisor_clients
>> supervisor_options.realize(args=["-c", os.path.join(self.supervisor_dir, "supervisord.conf")])
>> File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/supervisor-3.0-py2.7.egg/supervisor/options.py", line 1599, in realize
>> Options.realize(self, *arg, **kw)
>> File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/supervisor-3.0-py2.7.egg/supervisor/options.py", line 333, in realize
>> self.process_config()
>> File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/supervisor-3.0-py2.7.egg/supervisor/options.py", line 341, in process_config
>> self.process_config_file(do_usage)
>> File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/supervisor-3.0-py2.7.egg/supervisor/options.py", line 376, in process_config_file
>> self.usage(str(msg))
>> File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/supervisor-3.0-py2.7.egg/supervisor/options.py", line 164, in usage
>> self.exit(2)
>>SystemExit: 2
>>[18/Jul/2017 01:34:30 +0000] 4355 Dummy-1 daemonize WARNING Stopping daemon.
>>[18/Jul/2017 01:34:30 +0000] 4355 Dummy-1 agent INFO Stopping agent...
>>[18/Jul/2017 01:34:30 +0000] 4355 Dummy-1 agent INFO No extant cgroups; unmounting any cgroup roots
>>[18/Jul/2017 01:39:14 +0000] 5611 MainThread agent INFO SCM Agent Version: 5.12.0
>>[18/Jul/2017 01:39:14 +0000] 5611 MainThread agent INFO Agent Protocol Version: 4
>>[18/Jul/2017 01:39:14 +0000] 5611 MainThread agent INFO Using Host ID: b9e306ab-b527-4667-9f3e-b6acad9f5224
>>[18/Jul/2017 01:39:14 +0000] 5611 MainThread agent INFO Using directory: /run/cloudera-scm-agent
>>[18/Jul/2017 01:39:14 +0000] 5611 MainThread agent INFO Using supervisor binary path: /usr/lib64/cmf/agent/build/env/bin/supervisord
>>[18/Jul/2017 01:39:14 +0000] 5611 MainThread agent INFO Neither verify_cert_file nor verify_cert_dir are configured. Not performing validation of server certificates in HTTPS communication. These options can be configured in this agent's config.ini file to enable certificate validation.
>>[18/Jul/2017 01:39:14 +0000] 5611 MainThread agent INFO Agent Logging Level: INFO
>>[18/Jul/2017 01:39:14 +0000] 5611 MainThread agent INFO No command line vars
>>[18/Jul/2017 01:39:14 +0000] 5611 MainThread agent INFO Missing database jar: /usr/share/java/mysql-connector-java.jar (normal, if you're not using this database type)
>>[18/Jul/2017 01:39:14 +0000] 5611 MainThread agent INFO Missing database jar: /usr/share/java/oracle-connector-java.jar (normal, if you're not using this database type)
>>[18/Jul/2017 01:39:14 +0000] 5611 MainThread agent INFO Found database jar: /usr/share/cmf/lib/postgresql-9.0-801.jdbc4.jar
>>[18/Jul/2017 01:39:14 +0000] 5611 MainThread agent INFO Agent starting as pid 5611 user root(0) group root(0).
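The traceback above ends with supervisor's process_config_file calling usage() and exiting with status 2, which suggests that supervisord.conf could not be read or parsed. A minimal Python 3 sketch for checking that file by hand follows. It is not the agent's own logic: the function name diagnose_supervisor_conf is hypothetical, the path in the usage comment is inferred from the directories named in the log and may differ on your system, and Python's configparser only approximates how supervisor reads its configuration.

```python
import configparser
import os


def diagnose_supervisor_conf(path):
    """Return a short status string describing the config file at `path`."""
    if not os.path.isfile(path):
        return "missing"
    parser = configparser.RawConfigParser()
    try:
        # read() raises a configparser.Error subclass on malformed content.
        parser.read(path)
    except configparser.Error as exc:
        return "unparseable: %s" % exc
    if not parser.has_section("supervisord"):
        return "no [supervisord] section"
    return "ok"


# Example usage (path is an assumption based on the log lines above):
#   diagnose_supervisor_conf("/run/cloudera-scm-agent/supervisor/supervisord.conf")
```

A result of "missing" would be consistent with an agent that created or reused the directory but never wrote the file; anything else points to the file's contents.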


This is my current setup.

CentOS 7.2
Installing CDH 5.11.1 or 5.12 using Cloudera Manager.
4 nodes

/etc/hosts

192.168.0.101 node1.cirro.com node1
192.168.0.102 node2.cirro.com node2
192.168.0.103 node3.cirro.com node3
192.168.0.104 node4.cirro.com node4

/etc/sysconfig/network

NETWORKING=yes
HOSTNAME=myservers*.cirro.com
NOZEROCONF=yes

/etc/ssh/sshd_config

PermitRootLogin yes
PasswordAuthentication yes

The hostname has also been set on each node to match /etc/sysconfig/network.
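Since the installer's first hint is about the hostname, it can help to check each node's configured name against /etc/hosts. The sketch below is a Python 3 illustration under stated assumptions: the function names parse_hosts and hostname_problems are hypothetical, the checks (no wildcard, an entry exists, not a loopback address) are common pitfalls rather than an official Cloudera checklist, and the wildcard check only matters if the HOSTNAME value shown above is literal rather than a placeholder.

```python
def parse_hosts(text):
    """Map every name in hosts-file text to the IP address on its line."""
    mapping = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            mapping.setdefault(name, ip)  # first entry wins
    return mapping


def hostname_problems(hostname, hosts_text):
    """Return a list of human-readable problems; empty means none found."""
    problems = []
    if "*" in hostname:
        problems.append("hostname contains a wildcard character")
    ip = parse_hosts(hosts_text).get(hostname)
    if ip is None:
        problems.append("hostname has no entry in the hosts text")
    elif ip.startswith("127.") or ip == "::1":
        problems.append("hostname maps to a loopback address")
    return problems


# Example usage on a node:
#   import socket
#   with open("/etc/hosts") as fh:
#       print(hostname_problems(socket.gethostname(), fh.read()))
```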

sestatus = disabled
firewalld = inactive
ntpd = active (running)
httpd = active (running)
vm.swappiness = 10
user = passwordless sudo
/etc/rc.local has been set

Can anyone help me with this? I've been stuck on it for 2 weeks now. I've run out of options after searching online. Any help would be really appreciated!

1 ACCEPTED SOLUTION

Explorer

This is what I did instead. I followed Path B and downloaded 5.11.1 version instead. Solved all of my problems.


29 REPLIES

Champion

You need to change the /etc/sysconfig/network file on each node accordingly. For example:

 

Node 1 

/etc/sysconfig/network on node 1  

NETWORKING=yes
HOSTNAME=node1 
NETWORKING_IPV6=no

 

 

Then restart the network service; that should fix the error.

 

Let me know if that helps
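To confirm that such an edit took effect after the restart, the HOSTNAME value in the file can be compared with what the running system reports. Below is a minimal Python 3 sketch; the helper names parse_sysconfig and hostname_mismatch are hypothetical. One caveat to verify on your own system: on CentOS 7 the persistent hostname is normally kept in /etc/hostname (set with hostnamectl), so a HOSTNAME line in /etc/sysconfig/network may have no effect there.

```python
import socket


def parse_sysconfig(text):
    """Parse simple KEY=VALUE lines, ignoring blanks and comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip().strip("\"'")
    return settings


def hostname_mismatch(sysconfig_text, running_hostname=None):
    """Return (configured, running) if the two differ, else None."""
    if running_hostname is None:
        running_hostname = socket.gethostname()
    configured = parse_sysconfig(sysconfig_text).get("HOSTNAME")
    if configured == running_hostname:
        return None
    return (configured, running_hostname)


# Example usage on a node:
#   with open("/etc/sysconfig/network") as fh:
#       print(hostname_mismatch(fh.read()))
```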

Explorer

Hello CSGUNA!

Thanks for the help, but what do you recommend for Ubuntu 14?

Champion

@darkdante 

 

I don't have Ubuntu on my test machine, but I'll tell you anyway: do the same on Ubuntu.

 

/etc/hosts

192.168.200.11 Master

On the Master node, the /etc/hostname file should contain:
Master

Champion

@darkdante Were you able to receive a heartbeat from the agent?

 

Explorer

Hi csguna,

 

Here's the cloudera-scm-agent.log:

 

 

[18/Jul/2017 17:38:20 +0000] 15294 MainThread agent        INFO     To override these variables, use /etc/cloudera-scm-agent/config.ini. Environment variables for CDH locations are not used when CDH is installed from parcels.
[18/Jul/2017 17:38:20 +0000] 15294 MainThread agent        INFO     Created /run/cloudera-scm-agent/process
[18/Jul/2017 17:38:20 +0000] 15294 MainThread agent        INFO     Chmod'ing /run/cloudera-scm-agent/process to 0751
[18/Jul/2017 17:38:20 +0000] 15294 MainThread agent        INFO     Created /run/cloudera-scm-agent/supervisor
[18/Jul/2017 17:38:20 +0000] 15294 MainThread agent        INFO     Chmod'ing /run/cloudera-scm-agent/supervisor to 0751
[18/Jul/2017 17:38:20 +0000] 15294 MainThread agent        INFO     Created /run/cloudera-scm-agent/flood
[18/Jul/2017 17:38:20 +0000] 15294 MainThread agent        INFO     Chowning /run/cloudera-scm-agent/flood to cloudera-scm (988) cloudera-scm (983)
[18/Jul/2017 17:38:20 +0000] 15294 MainThread agent        INFO     Chmod'ing /run/cloudera-scm-agent/flood to 0751
[18/Jul/2017 17:38:20 +0000] 15294 MainThread agent        INFO     Created /run/cloudera-scm-agent/supervisor/include
[18/Jul/2017 17:38:20 +0000] 15294 MainThread agent        INFO     Chmod'ing /run/cloudera-scm-agent/supervisor/include to 0751
[18/Jul/2017 17:38:20 +0000] 15294 MainThread agent        ERROR    Failed to connect to previous supervisor.
Traceback (most recent call last):
  File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.12.0-py2.7.egg/cmf/agent.py", line 2109, in find_or_start_supervisor
    self.configure_supervisor_clients()
  File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.12.0-py2.7.egg/cmf/agent.py", line 2290, in configure_supervisor_clients
    supervisor_options.realize(args=["-c", os.path.join(self.supervisor_dir, "supervisord.conf")])
  File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/supervisor-3.0-py2.7.egg/supervisor/options.py", line 1599, in realize
    Options.realize(self, *arg, **kw)
  File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/supervisor-3.0-py2.7.egg/supervisor/options.py", line 333, in realize
    self.process_config()
  File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/supervisor-3.0-py2.7.egg/supervisor/options.py", line 341, in process_config
    self.process_config_file(do_usage)
  File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/supervisor-3.0-py2.7.egg/supervisor/options.py", line 376, in process_config_file
    self.usage(str(msg))
  File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/supervisor-3.0-py2.7.egg/supervisor/options.py", line 164, in usage
    self.exit(2)
SystemExit: 2
[18/Jul/2017 17:38:21 +0000] 15294 Dummy-1 daemonize    WARNING  Stopping daemon.
[18/Jul/2017 17:38:21 +0000] 15294 Dummy-1 agent        INFO     Stopping agent...
[18/Jul/2017 17:38:21 +0000] 15294 Dummy-1 agent        INFO     No extant cgroups; unmounting any cgroup roots

And this is the result of ps aux | grep supervisor:

 

root      15823  0.0  0.0 112644   972 pts/1    S+   17:57   0:00 grep --color=auto supervisor

As of today, it still reports the heartbeat failure.
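Note that the only line in the ps output above is the grep command itself, so no supervisord process was running at that moment. A minimal Python 3 sketch that filters such output and ignores the grep line follows. The function name supervisord_pids is hypothetical, and the code assumes the usual ps aux column layout, with the PID in the second column and the command starting at the eleventh.

```python
def supervisord_pids(ps_aux_output):
    """Return PIDs of processes whose command mentions supervisord,
    skipping the header line and any grep process."""
    pids = []
    for line in ps_aux_output.splitlines():
        # Columns: USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
        fields = line.split(None, 10)
        if len(fields) < 11 or not fields[1].isdigit():
            continue  # header or malformed line
        command = fields[10]
        executable = command.split()[0]
        if executable == "grep" or executable.endswith("/grep"):
            continue
        if "supervisord" in command:
            pids.append(int(fields[1]))
    return pids


# Example usage:
#   import subprocess
#   out = subprocess.run(["ps", "aux"], capture_output=True, text=True).stdout
#   print(supervisord_pids(out))
```

An empty list on the affected host matches what the traceback suggests: the agent exits before a supervisor is ever started.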

Champion

Bear with me, but did you change the /etc/sysconfig/network file on all four nodes (1, 2, 3, and 4) as mentioned earlier?

Did you restart the network on your OS?

 

New Contributor

In the end, what worked for me was:

1) kernel update to the newest version

2) removal of all the alternatives to java and javac, e.g. https://askubuntu.com/questions/613016/removing-oracle-jdk-and-re-configuring-update-alternatives

Explorer

This is what I did instead. I followed Path B and downloaded 5.11.1 version instead. Solved all of my problems.

Explorer

I still have the same problem with slight modifications 🙂

 

I was able to roll back the agent to 5.11, but the rest of the cluster was successfully upgraded to 5.12.

So, the 5.11 agent works fine, but the node is not visible in Cloudera Manager 5.12.

 

Explorer

Using 5.11 instead of 5.13 did not work for me. Nor did the change in /etc/sysconfig/network:

NETWORKING=yes
NETWORKING_IPV6=no
HOSTNAME=poweredge