I am trying to add a new host to my cluster, which so far has a single node. The new host is in a different data center from the existing node.
I keep getting the error below in cloudera-scm-agent.log, so my host never goes live. CM keeps showing bad health and reports that the host is not in contact with the Host Monitor, even though it is in contact with Cloudera Manager.
Please help me troubleshoot this. I came across some suggestions on this forum and would like to let you know that:
1. ping with size > 1500 bytes works:
ping -c 3 -s 1800 warehouse.swtched.com
2. I have added the FQDN and public IP of the existing node to the /etc/hosts file of the new node, which is hosted on AWS.
Please suggest what could be the problem.
Note: The new host is hosted on AWS, and I have allowed traffic on all ports from all IPs, so that shouldn't be the problem.
Also, the old host is not hosted on AWS and is located elsewhere. NTPD is enabled on both hosts and the heartbeat is working fine.
Please feel free to ask questions. Please help me troubleshoot this, as it is very specific to the Cloudera platform and I cannot get it answered anywhere else.
[03/Nov/2016 15:21:55 +0000] 56464 MonitorDaemon-Reporter throttling_logger ERROR Error sending messages to firehose: mgmt-HOSTMONITOR-5ccf80948a373fcc0e29b2976ccf7c19
Traceback (most recent call last):
  File "/usr/lib64/cmf/agent/build/env/lib/python2.6/site-packages/cmf-5.7.0-py2.6.egg/cmf/monitor/firehose.py", line 116, in _send
    self._port)
  File "/usr/lib64/cmf/agent/build/env/lib/python2.6/site-packages/avro-1.6.3-py2.6.egg/avro/ipc.py", line 469, in __init__
    self.conn.connect()
  File "/usr/lib64/python2.6/httplib.py", line 742, in connect
    self.timeout)
  File "/usr/lib64/python2.6/socket.py", line 567, in create_connection
    raise error, msg
timeout: timed out
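The traceback ends in socket.create_connection timing out, which means the agent could not open a TCP connection to the Host Monitor. A successful ping only shows that ICMP gets through, so it may be worth testing the TCP connection directly from the new host. Below is a minimal sketch of such a check. The helper name can_connect and the 5-second timeout are my own choices for illustration, not anything taken from the Cloudera agent.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds.

    Hypothetical helper for illustration; it uses the same
    socket.create_connection call that appears at the bottom of the traceback.
    """
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except (socket.timeout, socket.error) as exc:
        # Covers timeouts, refused connections and name-resolution failures.
        print("connect to %s:%s failed: %r" % (host, port, exc))
        return False
    conn.close()
    return True
```

To use it, run can_connect("warehouse.swtched.com", 9995) on the new node. The port 9995 is an assumption on my part (I believe it is the default Host Monitor listen port); the value actually configured in Cloudera Manager should be used instead. If the call returns False while ping works, something between the two data centers is probably dropping TCP traffic to that port, or the Host Monitor is listening on an address that is not reachable from outside.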
Did you find a solution for this? If so, could you please share it? I am trying to spin up a cluster using Cloudera Director, and it fails while bootstrapping the cluster nodes with the above error.