adding host agent to cluster fails

Explorer

After using the Cloudera Manager wizard to add a host to the cluster, the agent log on the new host shows the error below:

[07/Mar/2019 18:54:41 +0000] 20033 MainThread agent        INFO     To override these variables, use /etc/cloudera-scm-agent/config.ini. Environment variables for CDH locations are not used when CDH is installed from parcels.
[07/Mar/2019 18:54:43 +0000] 20033 MainThread supervisor   INFO     Trying to connect to supervisor (Attempt 1)
[07/Mar/2019 18:54:43 +0000] 20033 MainThread supervisor   INFO     Supervisor version: 3.0, pid: 18803
[07/Mar/2019 18:54:43 +0000] 20033 MainThread supervisor   INFO     Successfully connected to supervisor
[07/Mar/2019 18:54:43 +0000] 20033 MainThread agent        INFO     Supervisor version: 3.0, pid: 18803
[07/Mar/2019 18:54:43 +0000] 20033 MainThread agent        INFO     Connecting to previous supervisor: agent-18803-1551977586.
[07/Mar/2019 18:54:45 +0000] 20033 MainThread supervisor   INFO     Triggering supervisord update.
[07/Mar/2019 18:54:45 +0000] 20033 MainThread _cplogging   INFO     [07/Mar/2019:18:54:45] ENGINE Bus STARTING
[07/Mar/2019 18:54:45 +0000] 20033 MainThread _cplogging   INFO     [07/Mar/2019:18:54:45] ENGINE Started monitor thread '_TimeoutMonitor'.
[07/Mar/2019 18:54:45 +0000] 20033 MainThread _cplogging   INFO     [07/Mar/2019:18:54:45] ENGINE Serving on http://127.0.0.1:9001
[07/Mar/2019 18:54:45 +0000] 20033 MainThread _cplogging   INFO     [07/Mar/2019:18:54:45] ENGINE Bus STARTED
[07/Mar/2019 18:54:45 +0000] 20033 MainThread daemon       INFO     New monitor: (<cmf.monitor.host.HostMonitor object at 0x7f58b56cdb10>,)
[07/Mar/2019 18:54:45 +0000] 20033 MonitorDaemon-Scheduler daemon       INFO     Monitor ready to report: ('HostMonitor',)
[07/Mar/2019 18:54:45 +0000] 20033 MainThread agent        INFO     Setting default socket timeout to 45
[07/Mar/2019 18:54:45 +0000] 20033 MainThread agent        INFO     Previously active parcels: {'SPARK2': '2.3.0.cloudera4-1.cdh5.13.3.p0.611179', 'CDH': '5.14.4-1.cdh5.14.4.p0.3'}
[07/Mar/2019 18:54:45 +0000] 20033 MainThread agent        INFO     Loading last saved hb response to complete initialization: /var/lib/cloudera-scm-agent/response.avro
[07/Mar/2019 18:54:45 +0000] 20033 Monitor-HostMonitor network_interfaces INFO     NIC iface virbr0 doesn't support ETHTOOL (95)
[07/Mar/2019 18:54:45 +0000] 20033 MainThread heartbeat_tracker INFO     HB stats (seconds): num:1 LIFE_MIN:0.02 min:0.02 mean:0.02 max:0.02 LIFE_MAX:0.02
[07/Mar/2019 18:55:52 +0000] 20033 Monitor-HostMonitor throttling_logger ERROR    Timed out waiting for worker process collecting filesystem usage to complete. This may occur if the host has an NFS or other remote filesystem that is not responding to requests in a timely fashion. Current nodev filesystems: /dev/shm,/run,/sys/fs/cgroup,/run/user/1000,/run/cloudera-scm-agent/process,/run/cloudera-scm-agent/process,/run/user/0

and the CM wizard displays this error message:

Failed to receive heart beat from agent

Re: adding host agent to cluster fails

Expert Contributor

Hello @Exor,

 

Let's step back a bit. To understand the issue better, could you please help with a couple of preliminary checks:

  • Is the agent service installed, and is the agent log clean?
  • Is connectivity from the agent to the server working on the configured port?

I hope this gives more clarity on the next steps.

Re: adding host agent to cluster fails

Explorer

Hi! We have the same problem. In the log details we found this error: "Monitor-HostMonitor throttling_logger ERROR Timed out waiting for worker process collecting filesystem usage to complete. This may occur if the host has an NFS or other remote filesystem that is not responding to requests in a timely fashion. Current nodev filesystems: /dev/shm,/run,/sys/fs/cgroup,/run/cloudera-scm-agent/process,/run/cloudera-scm-agent/process,/run/user/1001,/run/user/1003,/run/user/0"
So far we have tried installing Cloudera Manager 6.2.0 and 6.1.1, with the same result. The agent host has no problem connecting to the Manager host: we checked with the command "telnet <Cloudera manager machine ip address or name> 7182", which connected successfully, and "ss -anp" showed an "established" connection on both hosts.
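
For anyone who wants to script the same port check instead of running telnet by hand, here is a minimal sketch. It only tests that a TCP connection to the Manager host can be opened on port 7182 (the port used in the telnet check above); it says nothing about whether heartbeats are actually accepted. The function name `can_connect` and the default timeout are my own choices, not anything from Cloudera.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection resolves the name and attempts the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

Run it on the agent host as `can_connect("<Cloudera manager machine ip address or name>", 7182)`. A `True` result is equivalent to the successful telnet session and nothing more.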

Re: adding host agent to cluster fails

Super Guru

Hi @Rasgeado ,

 

When you say that you have the same problem, what exactly is the issue? After the addition of a host fails, if you open CM and view the new host in the Hosts tab, does it show bad health? If so, click on that host to view any host health errors. This will give us the first clue.

 

Next, review the agent log on that host (normally /var/log/cloudera-scm-agent/cloudera-scm-agent.log).
While it is possible that the HostMonitor error is related, it is not likely, since the timeout is 2 seconds. More information about the problem would help us narrow down the possible causes.
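
To rule the filesystems in or out, one option is to time a usage query against each mount point named in the error. The sketch below is only an illustration and is not how the agent itself collects filesystem usage: it calls `os.statvfs` in a background thread and gives up after 2 seconds, mirroring the timeout mentioned above. The name `probe_mount` is hypothetical, and the mount list is copied from the log line quoted earlier in the thread.

```python
import os
import threading
import time
from typing import Dict, Optional, Tuple


def probe_mount(path: str, timeout: float = 2.0) -> Tuple[str, Optional[float]]:
    """Query filesystem usage for `path`, giving up after `timeout` seconds.

    Returns ("ok", seconds), ("error", seconds) or ("timeout", None).
    """
    outcome: Dict[str, object] = {}

    def worker() -> None:
        started = time.monotonic()
        try:
            os.statvfs(path)  # the call that can block on an unresponsive mount
            outcome["status"] = "ok"
        except OSError:
            outcome["status"] = "error"
        outcome["elapsed"] = time.monotonic() - started

    # A daemon thread lets the caller move on even if statvfs never returns.
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout)
    if thread.is_alive():
        return ("timeout", None)
    return (str(outcome["status"]), float(outcome["elapsed"]))  # type: ignore[arg-type]


if __name__ == "__main__":
    # Mount points taken from the "Current nodev filesystems" list in the error.
    for mount in ["/dev/shm", "/run", "/sys/fs/cgroup", "/run/cloudera-scm-agent/process"]:
        print(mount, probe_mount(mount))
```

If every mount comes back "ok" well under the timeout, the HostMonitor message is probably a side effect rather than the cause of the missing heartbeat.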

Re: adding host agent to cluster fails

Explorer

Hi @bgooley! Thanks for the reply. The main trouble is that we can't get past the "Install agents" step because of this error:

"Monitor-HostMonitor throttling_logger ERROR Timed out waiting for worker process collecting filesystem usage to complete. This may occur if the host has an NFS or other remote filesystem that is not responding to requests in a timely fashion. Current nodev filesystems: /dev/shm,/run,/sys/fs/cgroup,/run/cloudera-scm-agent/process,/run/cloudera-scm-agent/process,/run/user/1001,/run/user/1003,/run/user/0", which I found in "Details". There are no other errors, only this one.

If I open CM and go to the Hosts tab, no hosts have been added; the only one listed is the host on which CM is installed.

What additional information can I provide to help solve the issue?

 

 

Re: adding host agent to cluster fails

Super Guru

Hi @Rasgeado ,

 

Have you checked the /var/log/cloudera-scm-agent/cloudera-scm-agent.log file on the host you are trying to add? CM executes an scm_prepare_node script on the host, so it sounds as if the steps leading up to the heartbeat detection succeed. The most useful information, then, would be in that log.

 

You might look for errors or messages regarding the heartbeat.

 

Try restarting the agent if you don't see any errors pertaining to the heartbeat:

 

# service cloudera-scm-agent restart

 

Then review the log for any heartbeat errors or messages.
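
If the log is long, a small filter can pull out the relevant lines. This is a rough sketch that assumes the line layout seen in the log excerpt at the top of this thread (timestamp in brackets, pid, thread, component, level, message). The level names `WARNING` and `CRITICAL` are an assumption based on standard Python logging, and `interesting_entries` is just a name picked for this example. Lines that do not match the layout, such as traceback continuation lines, are skipped.

```python
import re
from typing import Iterable, Iterator, NamedTuple

# Layout assumed from the excerpt in this thread:
# [07/Mar/2019 18:54:43 +0000] 20033 MainThread supervisor   INFO     message
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+(?P<pid>\d+)\s+(?P<thread>\S+)\s+"
    r"(?P<component>\S+)\s+(?P<level>[A-Z]+)\s+(?P<message>.*)$"
)


class LogEntry(NamedTuple):
    ts: str
    thread: str
    component: str
    level: str
    message: str


def interesting_entries(lines: Iterable[str]) -> Iterator[LogEntry]:
    """Yield entries that are warnings/errors or that mention the heartbeat."""
    for line in lines:
        match = LINE_RE.match(line.rstrip("\n"))
        if match is None:
            continue  # not in the assumed layout
        text = (match["component"] + " " + match["message"]).lower()
        if match["level"] in {"WARNING", "ERROR", "CRITICAL"} or "heartbeat" in text:
            yield LogEntry(match["ts"], match["thread"], match["component"],
                           match["level"], match["message"])
```

Typical use would be to open /var/log/cloudera-scm-agent/cloudera-scm-agent.log after the restart, pass the file object to `interesting_entries`, and print each result.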

Re: adding host agent to cluster fails

New Contributor

Try to manually ssh between the Ambari host and the new host from a terminal, using the private/public key pair. In some cases a first-time connection needs to be established so that the host is added to the known hosts.

Re: adding host agent to cluster fails

New Contributor

Hi, check whether the hostname is resolved correctly, try disabling the firewall (if it is enabled), and launch the host inspector from Ambari/CDH to identify misconfiguration at the network level.

BR

Gianluca

Re: adding host agent to cluster fails

Explorer

Hi! Thanks for the answer!
We have been using Cloudera Hadoop for about two years and have installed it many times, but this is the first time we have run into this trouble.
Hostnames are resolved correctly (we have our own DNS with forward and reverse zones), and there are no firewall rules on any host at all.
SELinux is disabled. Moreover, I can telnet to the CM host by its hostname from the agent hosts on port 7182, but we still see this annoying error.
We can't run the host inspector because the hosts can't get past the "Install agents" step due to this error. At this point we have no idea what the problem is or what to do.
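
For readers who want to double-check forward and reverse resolution on each host, here is a hedged sketch using only the Python standard library. It resolves the name to an address, resolves that address back to a name, and compares the two. The function name `check_dns` and its `forward`/`reverse` parameters are made up for this example; the parameters exist only so the lookups can be swapped out, and they default to the real `socket` calls.

```python
import socket
from typing import Callable, Dict, Optional


def check_dns(
    hostname: str,
    forward: Callable[[str], str] = socket.gethostbyname,
    reverse: Callable = socket.gethostbyaddr,
) -> Dict[str, Optional[object]]:
    """Resolve hostname -> address -> name and report whether they agree."""
    result: Dict[str, Optional[object]] = {
        "hostname": hostname, "address": None,
        "reverse_name": None, "consistent": False,
    }
    try:
        address = forward(hostname)
    except OSError:  # socket.gaierror is a subclass of OSError
        return result
    result["address"] = address
    try:
        reverse_name = reverse(address)[0]  # (name, aliases, addresses)
    except OSError:  # socket.herror is a subclass of OSError
        return result
    result["reverse_name"] = reverse_name
    # Compare case-insensitively and ignore a trailing dot on either side.
    result["consistent"] = (
        reverse_name.lower().rstrip(".") == hostname.lower().rstrip(".")
    )
    return result
```

Calling `check_dns` with the fully qualified name of the CM host on every agent host, and with each agent's name on the CM host, should report `consistent` as `True` everywhere if the forward and reverse zones agree.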