Created 12-08-2020 05:14 AM
cloudera-scm-agent.log
Monitor-HostMonitor throttling_logger ERROR
[08/Dec/2020 12:28:50 +0000] 76583 Monitor-HostMonitor throttling_logger ERROR (2 skipped) Timed out waiting for worker process collecting filesystem usage to complete. This may occur if the host has an NFS or other remote filesystem that is not responding to requests in a timely fashion. Current nodev filesystems: /run,/dev/shm,/sys/fs/cgroup,/run/user/0,/run/cloudera-scm-agent/process,/run/cloudera-scm-agent/process,/run/user/1074
[08/Dec/2020 12:35:33 +0000] 76583 MainThread heartbeat_tracker INFO HB stats (seconds): num:40 LIFE_MIN:0.02 min:0.02 mean:0.04 max:0.12 LIFE_MAX:0.09
[08/Dec/2020 12:45:33 +0000] 76583 MainThread heartbeat_tracker INFO HB stats (seconds): num:40 LIFE_MIN:0.02 min:0.02 mean:0.03 max:0.08 LIFE_MAX:0.09
[08/Dec/2020 12:55:34 +0000] 76583 MainThread heartbeat_tracker INFO HB stats (seconds): num:40 LIFE_MIN:0.02 min:0.03 mean:0.06 max:0.35 LIFE_MAX:0.09
[08/Dec/2020 12:55:49 +0000] 76583 MainThread throttling_logger INFO (14 skipped) Identified java component java8 with full version java version "1.8.0_251" Java(TM) SE Runtime Environment (build 1.8.0_251-b08) Java HotSpot(TM) 64-Bit Server VM (build 25.251-b08, mixed mode) for requested version .
[08/Dec/2020 13:05:35 +0000] 76583 MainThread heartbeat_tracker INFO HB stats (seconds): num:40 LIFE_MIN:0.02 min:0.03 mean:0.04 max:0.09 LIFE_MAX:0.09
[08/Dec/2020 13:13:50 +0000] 76583 Monitor-HostMonitor throttling_logger ERROR (1 skipped) Timed out waiting for worker process collecting filesystem usage to complete. This may occur if the host has an NFS or other remote filesystem that is not responding to requests in a timely fashion. Current nodev filesystems: /run,/dev/shm,/sys/fs/cgroup,/run/user/0,/run/cloudera-scm-agent/process,/run/cloudera-scm-agent/process,/run/user/1074
[08/Dec/2020 13:15:35 +0000] 76583 MainThread heartbeat_tracker INFO HB stats (seconds): num:40 LIFE_MIN:0.02 min:0.02 mean:0.04 max:0.08 LIFE_MAX:0.09
[08/Dec/2020 13:25:36 +0000] 76583 MainThread heartbeat_tracker INFO HB stats (seconds): num:40 LIFE_MIN:0.02 min:0.02 mean:0.04 max:0.11 LIFE_MAX:0.09
[08/Dec/2020 13:26:06 +0000] 76583 MainThread throttling_logger INFO (14 skipped) Identified java component java8 with full version java version "1.8.0_251" Java(TM) SE Runtime Environment (build 1.8.0_251-b08) Java HotSpot(TM) 64-Bit Server VM (build 25.251-b08, mixed mode) for requested version .
[08/Dec/2020 13:35:36 +0000] 76583 MainThread heartbeat_tracker INFO HB stats (seconds): num:40 LIFE_MIN:0.02 min:0.02 mean:0.04 max:0.14 LIFE_MAX:0.09
[08/Dec/2020 13:45:37 +0000] 76583 MainThread heartbeat_tracker INFO HB stats (seconds): num:40 LIFE_MIN:0.02 min:0.02 mean:0.04 max:0.06 LIFE_MAX:0.09
[08/Dec/2020 13:55:37 +0000] 76583 MainThread heartbeat_tracker INFO HB stats (seconds): num:40 LIFE_MIN:0.02 min:0.02 mean:0.04 max:0.06 LIFE_MAX:0.09
[08/Dec/2020 13:56:22 +0000] 76583 MainThread throttling_logger INFO (14 skipped) Identified java component java8 with full version java version "1.8.0_251" Java(TM) SE Runtime Environment (build 1.8.0_251-b08) Java HotSpot(TM) 64-Bit Server VM (build 25.251-b08, mixed mode) for requested version .
[08/Dec/2020 14:05:38 +0000] 76583 MainThread heartbeat_tracker INFO HB stats (seconds): num:40 LIFE_MIN:0.02 min:0.02 mean:0.04 max:0.16 LIFE_MAX:0.09
[08/Dec/2020 14:12:50 +0000] 76583 Monitor-HostMonitor throttling_logger ERROR Timed out waiting for worker process collecting filesystem usage to complete. This may occur if the host has an NFS or other remote filesystem that is not responding to requests in a timely fashion. Current nodev filesystems: /run,/dev/shm,/sys/fs/cgroup,/run/user/0,/run/cloudera-scm-agent/process,/run/cloudera-scm-agent/process,/run/user/1074
[08/Dec/2020 14:15:38 +0000] 76583 MainThread heartbeat_tracker INFO HB stats (seconds): num:40 LIFE_MIN:0.02 min:0.03 mean:0.05 max:0.10 LIFE_MAX:0.09
[08/Dec/2020 14:25:39 +0000] 76583 MainThread heartbeat_tracker INFO HB stats (seconds): num:40 LIFE_MIN:0.02 min:0.03 mean:0.05 max:0.07 LIFE_MAX:0.09
[08/Dec/2020 14:26:24 +0000] 76583 MainThread throttling_logger INFO (14 skipped) Identified java component java8 with full version java version "1.8.0_251" Java(TM) SE Runtime Environment (build 1.8.0_251-b08) Java HotSpot(TM) 64-Bit Server VM (build 25.251-b08, mixed mode) for requested version .
[08/Dec/2020 14:35:40 +0000] 76583 MainThread heartbeat_tracker INFO HB stats (seconds): num:40 LIFE_MIN:0.02 min:0.02 mean:0.04 max:0.09 LIFE_MAX:0.09
[08/Dec/2020 14:45:40 +0000] 76583 MainThread heartbeat_tracker INFO HB stats (seconds): num:40 LIFE_MIN:0.02 min:0.02 mean:0.04 max:0.06 LIFE_MAX:0.09
[08/Dec/2020 14:55:41 +0000] 76583 MainThread heartbeat_tracker INFO HB stats (seconds): num:40 LIFE_MIN:0.02 min:0.03 mean:0.04 max:0.07 LIFE_MAX:0.09
[08/Dec/2020 14:56:26 +0000] 76583 MainThread throttling_logger INFO (14 skipped) Identified java component java8 with full version java version "1.8.0_251" Java(TM) SE Runtime Environment (build 1.8.0_251-b08) Java HotSpot(TM) 64-Bit Server VM (build 25.251-b08, mixed mode) for requested version .
Created 12-09-2020 08:23 AM
Is this happening on a new node install, or has this node previously worked?
If the former...
1. Verify SELinux is disabled
2. Run "df" from the CLI to ensure it responds in a timely manner
3. Verify you can ssh to the machine with the credentials you used for install
4. Verify you can reach the agent port (7182) from the CM.
5. Restart the agent
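Checks 2 and 4 above can be scripted. What follows is only a rough sketch, not anything the agent itself runs: the helper names `time_df` and `port_reachable` are made up for illustration, the 5 second and 3 second timeouts are arbitrary, and the mount points are copied from the "Current nodev filesystems" list in the log. Note that a `df` stuck on a hung NFS mount may ignore the kill, so the sketch does not wait for it to exit.

```python
import socket
import subprocess
import time


def time_df(path, timeout_s=5.0):
    """Return how many seconds `df -P path` took, or None if it exceeded timeout_s.

    A nonexistent path makes df fail quickly; that still counts as "responded".
    """
    start = time.monotonic()
    proc = subprocess.Popen(
        ["df", "-P", path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    try:
        proc.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # Best effort only: a process blocked on a dead NFS mount may not exit.
        proc.kill()
        return None
    return time.monotonic() - start


def port_reachable(host, port, timeout_s=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout_s."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Mount points taken from the agent log; adjust for the host being checked.
    mounts = ["/run", "/dev/shm", "/sys/fs/cgroup", "/run/cloudera-scm-agent/process"]
    for mount in mounts:
        elapsed = time_df(mount)
        status = "TIMED OUT" if elapsed is None else f"{elapsed:.2f}s"
        print(f"{mount}: {status}")
```

For the port check, something like `port_reachable("cm-host.example.com", 7182)` would do, where the hostname is a placeholder for your own CM host.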
If the latter...
1. Have there been any filesystem changes? Check "df" to make sure it responds properly.
2. Verify forward and reverse DNS is working. You can use the CM Network Inspector for this.
3. Restart the agent
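If you want a quick look at forward and reverse DNS from the command line before running the Network Inspector, a small sketch like the one below may help. It is not a replacement for the inspector; `check_dns` is a made-up helper name, and it only covers IPv4 addresses.

```python
import socket


def check_dns(hostname):
    """Forward-resolve hostname, then reverse-resolve each IPv4 address.

    Returns a dict mapping each address to its reverse name,
    or to None when the reverse lookup fails.
    """
    results = {}
    _, _, addresses = socket.gethostbyname_ex(hostname)
    for address in addresses:
        try:
            reverse_name, _, _ = socket.gethostbyaddr(address)
        except OSError:
            # socket.herror and socket.gaierror are both OSError subclasses.
            reverse_name = None
        results[address] = reverse_name
    return results


if __name__ == "__main__":
    fqdn = socket.getfqdn()
    try:
        for address, reverse_name in check_dns(fqdn).items():
            print(f"{fqdn} -> {address} -> {reverse_name}")
    except socket.gaierror as exc:
        print(f"forward lookup failed for {fqdn}: {exc}")
```

A reverse name that is None, or that differs from the host's FQDN, would be worth a closer look.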
Created 12-11-2020 06:16 AM
This is a new cluster. All installations are done and it is currently in production. This is happening on the NameNode. At times during the day, for 10 to 20 minutes, it is not in sync with the NFS server. All running services are raising clock offset alerts.
Created 12-11-2020 06:24 AM
If I understand your response, you have an NFS server that’s timing out and clocks that are out of sync.
For the former, the NFS admin needs to address this. It’s not typical to run NFS on a NameNode, BTW.
For time sync, you need to be running NTP or chrony.
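If you go with chrony, you can check the current offset on each host once it is running. The sketch below assumes `chronyc` is installed and that `chronyc tracking` prints a "System time" line in its usual "N seconds fast/slow of NTP time" form; the helper names `parse_offset` and `current_offset` are made up for illustration.

```python
import re
import subprocess

# Assumed format of the "System time" line in `chronyc tracking` output.
_SYSTEM_TIME = re.compile(
    r"^System time\s*:\s*([0-9.]+) seconds (fast|slow) of NTP time",
    re.MULTILINE,
)


def parse_offset(tracking_output):
    """Return the clock offset in seconds (positive = fast), or None if not found."""
    match = _SYSTEM_TIME.search(tracking_output)
    if match is None:
        return None
    seconds = float(match.group(1))
    return seconds if match.group(2) == "fast" else -seconds


def current_offset():
    """Run `chronyc tracking` and return the parsed offset in seconds."""
    completed = subprocess.run(
        ["chronyc", "tracking"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_offset(completed.stdout)


if __name__ == "__main__":
    try:
        print(f"offset: {current_offset()} s")
    except (FileNotFoundError, subprocess.CalledProcessError) as exc:
        print(f"could not query chrony: {exc}")
```

With NTP instead of chrony, the same idea applies but the command and output format differ.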