11-02-2016 11:10 AM - edited 11-02-2016 11:15 AM
I tried to add a new host to my cluster, which so far has been a single node running all the services.
Before adding it (out of some confusion, and wanting to make sure it would work), I edited the /etc/hosts file on the existing node and added this entry:
Public IP FQDN (of the new host I was going to add)
Note that this is the public IP, not the IP I get from ifconfig. (Also, why do I see people adding private IPs here? I tried that and could not ping with it, so I switched to the public IP instead. If a private IP isn't visible to the outside world, how would it resolve and respond to ping?)
While adding the host, however, I got some warnings:
Please see the attached snapshot.
Similarly, I added the public IP of the existing host to the /etc/hosts file on the new host, and the two hosts could ping each other. However, looking at the warnings:
For the DNS lookup, it says: Got 'public IP' but expected 'private IP'. So I thought, all right, I will remove the entry from my /etc/hosts file altogether.
I still do not understand, though, why the reverse DNS lookup fails in both cases.
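For anyone who wants to reproduce the lookup mismatch outside the wizard, here is a rough sketch of a forward and reverse lookup check. It is not Cloudera's inspector code, just my approximation of the comparison behind the "got X but expected Y" warning. The function name check_dns_consistency and the injectable forward/reverse parameters are my own inventions; the underlying calls are the standard socket.gethostbyname and socket.gethostbyaddr.

```python
import socket


def check_dns_consistency(fqdn, expected_ip,
                          forward=socket.gethostbyname,
                          reverse=socket.gethostbyaddr):
    """Return a list of problems; an empty list means name and IP agree.

    `forward` and `reverse` are injectable only so the check can be tried
    with fake resolvers; by default the system resolver is used, which
    normally consults /etc/hosts before DNS.
    """
    problems = []

    # Forward lookup: host name -> IP address.
    try:
        resolved_ip = forward(fqdn)
        if resolved_ip != expected_ip:
            problems.append("forward lookup of %s: got %s but expected %s"
                            % (fqdn, resolved_ip, expected_ip))
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        problems.append("forward lookup of %s failed: %s" % (fqdn, exc))

    # Reverse lookup: IP address -> primary host name.
    try:
        resolved_name = reverse(expected_ip)[0]
        if resolved_name != fqdn:
            problems.append("reverse lookup of %s: got %s but expected %s"
                            % (expected_ip, resolved_name, fqdn))
    except OSError as exc:  # socket.herror is a subclass of OSError
        problems.append("reverse lookup of %s failed: %s" % (expected_ip, exc))

    return problems
```

Calling check_dns_consistency with a host's FQDN and the IP that ifconfig reports on it, from each machine in turn, should show whether the two hosts agree on each other's names and addresses.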
Problem: Although the agent heartbeat is available, Cloudera Manager says:
This host is in contact with the Cloudera Manager Server. This host is not in contact with the Host Monitor.
How do I fix this? I removed the IPs I had added to /etc/hosts on both systems and restarted cloudera scm (and the other services as well).
But this did not resolve it.
2. I haven't added any services to the new host yet, since it asked for a host template (and I didn't have one at the time). Could this be the reason? Can I add services to the host and check whether that helps, even if this error continues to show up? If yes, how do I add services to this new host?
11-03-2016 02:33 AM
Also, I get this error in the Cloudera SCM agent log on the new host:
[03/Nov/2016 14:56:12 +0000] 50921 MainThread heartbeat_tracker INFO HB stats (seconds): num:40 LIFE_MIN:0.56 min:0.53 mean:0.56 max:0.57 LIFE_MAX:1.07
[03/Nov/2016 14:57:41 +0000] 50921 MonitorDaemon-Reporter throttling_logger ERROR (9 skipped) Error sending messages to firehose: mgmt-HOSTMONITOR-5ccf80948a373fcc0e29b2976ccf7c19
Traceback (most recent call last):
  File "/usr/lib64/cmf/agent/build/env/lib/python2.6/site-packages/cmf-5.7.0-py2.6.egg/cmf/monitor/firehose.py", line 116, in _send
    self._port)
  File "/usr/lib64/cmf/agent/build/env/lib/python2.6/site-packages/avro-1.6.3-py2.6.egg/avro/ipc.py", line 469, in __init__
    self.conn.connect()
  File "/usr/lib64/python2.6/httplib.py", line 742, in connect
    self.timeout)
  File "/usr/lib64/python2.6/socket.py", line 567, in create_connection
    raise error, msg
timeout: timed out
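Since the traceback ends in a plain socket timeout inside create_connection, the same kind of connection can be attempted by hand from the new host. Below is a minimal sketch, assuming the agent opens an ordinary TCP connection to the Host Monitor. The helper name can_connect is hypothetical, and the port is something I have not confirmed: I believe the Host Monitor listen port defaults to 9995, but it should be checked in the Cloudera Management Service configuration.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) opens within `timeout` seconds."""
    try:
        # Same standard-library call that appears at the bottom of the agent traceback.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and name resolution failures.
        return False


# Assumption: 9995 is the Host Monitor listen port; verify it in your own config.
ASSUMED_HOST_MONITOR_PORT = 9995
```

Running can_connect with the FQDN of the host that runs the Host Monitor and ASSUMED_HOST_MONITOR_PORT, from the new host, would show whether the port is reachable at all. A False result would point at a firewall rule or at the name resolving to an address that is not reachable from the new host, rather than at the agent itself.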