Cloudera Manager - Installation failed. Failed to receive heartbeat from agent.

Explorer

Hi,

 

I'm trying to install Cloudera Manager on CentOS 7 (on VirtualBox). However, I am stuck at the 'Add Cluster' step, which gives me the error 'Installation failed. Failed to receive heartbeat from agent.' on all hosts. I have tried several solutions posted on this community but am still unable to solve it. Here are some details.

 

Detail:

OS: CentOS7

 

Cloudera Manager version: 6.3.1

 

/etc/hosts:

127.0.0.1   localhost   localhost.localdomain   localhost4   localhost4.localdomain4

::1              localhost   localhost.localdomain   localhost6   localhost6.localdomain6

192.168.56.xx1   namenode.localdomain   namenode

192.168.56.xx2   datanode1.localdomain   datanode1

192.168.56.xx3   datanode2.localdomain   datanode2

192.168.56.xx4   datanode3.localdomain   datanode3

192.168.56.106   util.localdomain               util

 

error:

>>[09/Jul/2020 23:45:02 +0000] 10401 MainThread agent ERROR Heartbeating to util.localdomain:7182 failed.
>>Traceback (most recent call last):
>> File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/cmf/agent.py", line 1399, in _send_heartbeat

>> response = self.requestor.request('heartbeat', heartbeat_data)
>> File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/avro/ipc.py", line 141, in request
>> return self.issue_request(call_request, message_name, request_datum)
>> File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/avro/ipc.py", line 254, in issue_request
>> call_response = self.transceiver.transceive(call_request)
>> File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/avro/ipc.py", line 483, in transceive
>> result = self.read_framed_message()
>> File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/avro/ipc.py", line 489, in read_framed_message
>> framed_message = response_reader.read_framed_message()
>> File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/avro/ipc.py", line 417, in read_framed_message
>> raise ConnectionClosedException("Reader read 0 bytes.")
>>ConnectionClosedException: Reader read 0 bytes.

 

netstat:

https://imgur.com/PWJhUYt

 

iptables:

https://imgur.com/nqUJKGZ

 

/var/log/cloudera-scm-agent/cloudera-scm-agent.log:

https://imgur.com/oGnDNyI

 

Thank you.

9 REPLIES

Mentor

@Petch 

The problem seems to be with port 7182 on the cluster, but I can see that it's OK on host util.localdomain. Can you validate on these hosts too?

192.168.56.xx1 namenode.localdomain namenode
192.168.56.xx2 datanode1.localdomain datanode1
192.168.56.xx3 datanode2.localdomain datanode2
192.168.56.xx4 datanode3.localdomain datanode3

There could be a firewall running on one of those hosts. Please check that and let me know.
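One way to run that check from each host is a small port probe like the sketch below. It assumes bash (for its /dev/tcp redirection) and the coreutils timeout command are installed; the helper names check_port and report_port are made up for illustration.

```shell
# Sketch: test whether a TCP port accepts connections.
# Assumes bash (for /dev/tcp) and coreutils timeout are available.
check_port() {
  # $1 = host, $2 = port; returns 0 if the connection opens within 3 seconds.
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Print a one-line verdict for a host/port pair.
report_port() {
  if check_port "$1" "$2"; then
    echo "$1:$2 reachable"
  else
    echo "$1:$2 NOT reachable"
  fi
}

# Run on each cluster host (namenode, datanode1, ...):
#   report_port util.localdomain 7182
```

A "NOT reachable" result from one host but not the others would point at a firewall or routing problem on that host rather than at the server.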

 

 

 

Explorer

@Shelton 

Thank you for your reply. I have checked every host as you suggested, and the result remains the same.

Mentor

@Petch 

 

Firstly, can you ensure the agents are running on all the hosts? I think it would also be a good idea to truncate the old logs.

Stop Agent(s)

$ sudo service cloudera-scm-agent stop

Truncate the agent logs

$ sudo truncate --size 0 /var/log/cloudera-scm-agent/cloudera-scm-agent.log

Restart Agent(s)

$ sudo service cloudera-scm-agent restart

Checking Agent(s) Status

$ sudo service cloudera-scm-agent status

Then retry and let me know.

Explorer

@Shelton 

 

I tried with 2 hosts as shown here. Both machines keep returning the same error.

Mentor

@Petch 

There are a couple of things I would like you to test and share the results.

What is the version of your OpenJDK? Is it higher or lower than 1.8.0_181? If it's lower, you will need to upgrade your JDK.

Can you test the TLS/SSL negotiation? It should report a verify return code of 0.

# openssl s_client -connect util.localdomain:7183

 

SELinux or iptables must be disabled.

From datanode1 can you telnet successfully?

# telnet 192.168.56.106 7182

Can you check the hostname is properly configured on util

# hostname -f
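For the TLS/SSL test above, the result can be read off without scanning the full output with a sketch like the following. It assumes openssl prints a line of the form "Verify return code: N (...)"; the helper names verify_code and tls_verify_code are hypothetical.

```shell
# Hypothetical helper: pull the numeric code out of openssl s_client output on stdin.
verify_code() {
  sed -n 's/^[[:space:]]*Verify return code: \([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Hypothetical helper: connect to host:port, then print the verify return code.
# </dev/null ends the session once the handshake is done.
tls_verify_code() {
  openssl s_client -connect "$1:$2" </dev/null 2>/dev/null | verify_code
}

# Example, using the host and port from the command above:
#   tls_verify_code util.localdomain 7183
```

A value of 0 means the certificate chain verified. A nonzero value generally means the connection was made but the chain could not be verified against the local trust store, which is a different condition from the port being unreachable.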

 

Please report back.

Explorer

@Shelton 

 

My JDK version is 1.8.0_181.

 

The TLS/SSL test returns code 19 (self signed certificate in certificate chain). Does this mean something is wrong with the TLS/SSL installation?

Screen Shot 2563-07-14 at 08.25.20.png

 

Telnet connects successfully and the hostname is set correctly.

Screen Shot 2563-07-14 at 08.41.19.png

 

Screen Shot 2563-07-14 at 08.41.42.png

Mentor

@Petch 

I don't know whether you accepted your own answer unintentionally, but it seems you still have questions. Is your problem resolved? If not, please reject the answer and update the thread.

Explorer

@Shelton 

I'm new to the security part. I am still unable to solve this. Please advise me on how to solve the problem.

Here is what I have done in the installation.

Cloudera Employee

@Petch I see that you are getting this error:

[09/Jul/2020 23:45:02 +0000] 10401 MainThread agent ERROR Heartbeating to util.localdomain:7182 failed.
ConnectionClosedException: Reader read 0 bytes.

 

This error is usually reported when Cloudera Manager expects the agent to communicate over TLS but the agent's config.ini has the 'use_tls' setting set to 0.

Check the setting and change it to use_tls=1.

Then restart the agent using the following command:

service cloudera-scm-agent restart
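To confirm the current value before and after the change, something like this sketch could be used. It assumes the agent config is at /etc/cloudera-scm-agent/config.ini (adjust CONFIG if your install differs); the get_use_tls helper is made up for illustration.

```shell
# Sketch: report the use_tls value in an agent config file.
# Assumption: the agent config lives at /etc/cloudera-scm-agent/config.ini.
get_use_tls() {
  # Print the value of the last uncommented use_tls line, or nothing if unset.
  sed -n 's/^[[:space:]]*use_tls[[:space:]]*=[[:space:]]*\([^[:space:]#;]*\).*$/\1/p' "$1" | tail -n 1
}

CONFIG="${CONFIG:-/etc/cloudera-scm-agent/config.ini}"
if [ -r "$CONFIG" ]; then
  echo "use_tls=$(get_use_tls "$CONFIG")"
else
  echo "cannot read $CONFIG" >&2
fi
```

Commented-out lines such as "# use_tls=1" are ignored, so the output reflects only the setting that is actually active in the file.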
