Created on 02-25-2025 04:08 AM - edited 02-25-2025 04:09 AM
Summary
Resolved service errors on Datanode#09 by identifying and addressing NFS mount issues. The Cloudera agent was reconfigured to bypass NFS checks temporarily, allowing services to return to a healthy state.
Symptoms
Multiple service errors reported on Datanode#09.
Cloudera agent was not in contact with Cloudera Manager.
Filesystem usage for certain nodev filesystems was unknown due to an inactive worker process.
Cause
The NFS partition was not properly mounted on the host, which led to problems with the Cloudera agent's health checks.
The Cloudera agent configuration was set to monitor NFS mounts (monitored_nodev_filesystem_types=nfs,nfs4,tmpfs), so the filesystem health check failed while the NFS partition was unmounted.
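One way to confirm this cause is to compare the NFS entries declared for the host with what is actually mounted. The following is a minimal sketch, assuming the NFS partition is declared in /etc/fstab; the function name missing_nfs_mounts is hypothetical and not part of any Cloudera tooling. It only reads files and changes nothing.

```shell
# Hypothetical helper: print NFS mount points that are declared in an
# fstab-style file but are not present in a mounts-style file.
# Defaults assume a Linux host: /etc/fstab and /proc/mounts.
missing_nfs_mounts() {
    fstab="${1:-/etc/fstab}"
    mounts="${2:-/proc/mounts}"
    awk '
        # First file (mounts): remember every path mounted as nfs or nfs4.
        FILENAME == ARGV[1] {
            if ($3 == "nfs" || $3 == "nfs4") mounted[$2] = 1
            next
        }
        # Second file (fstab): skip comments and incomplete lines.
        /^[[:space:]]*#/ || NF < 3 { next }
        # Report declared NFS mount points that are not currently mounted.
        ($3 == "nfs" || $3 == "nfs4") && !($2 in mounted) { print $2 }
    ' "$mounts" "$fstab"
}

# Example (run on the affected host):
#   missing_nfs_mounts
```

Any path printed is an NFS mount point that is declared but not mounted. Empty output means every declared NFS mount is present, in which case the cause may lie elsewhere.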
Instructions
Temporarily comment out the NFS monitoring line in the Cloudera agent configuration file to bypass the check and restore agent communication:
Open /etc/cloudera-scm-agent/config.ini using a text editor.
Comment out the line monitored_nodev_filesystem_types=nfs,nfs4,tmpfs.
Restart the Cloudera agent service: service cloudera-scm-agent restart.
Once the NFS mount point is recovered with the help of the OS team, restore the original configuration:
Uncomment the previously commented line in /etc/cloudera-scm-agent/config.ini.
Restart the Cloudera agent service to apply the changes: service cloudera-scm-agent restart.
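The two configuration edits above can also be scripted, which helps when several hosts are affected. This is a sketch under stated assumptions, not an official procedure: the function names and the "#nfs-workaround# " marker are hypothetical, and the sed -i.SUFFIX form is assumed to be available (it is in GNU sed). The marker is used instead of a bare "#" so that the restore step only touches lines this helper disabled and leaves any commented example line in the file alone. The block only defines functions; nothing is changed until they are called.

```shell
# Path and setting name are taken from the article; everything else is hypothetical.
CONFIG="${CONFIG:-/etc/cloudera-scm-agent/config.ini}"
KEY="monitored_nodev_filesystem_types"
MARKER="#nfs-workaround# "

# Comment out the active setting by prefixing it with MARKER.
# A copy of the file as it was before the edit is kept as <file>.pre-disable.
disable_nodev_monitoring() {
    sed -i.pre-disable -E "s/^([[:space:]]*${KEY}[[:space:]]*=)/${MARKER}\1/" "${1:-$CONFIG}"
}

# Remove MARKER again once the NFS mount is back.
# A copy of the file as it was before the edit is kept as <file>.pre-restore.
restore_nodev_monitoring() {
    sed -i.pre-restore -E "s/^${MARKER}//" "${1:-$CONFIG}"
}

# Example (as root on the affected host):
#   disable_nodev_monitoring && service cloudera-scm-agent restart
#   ... NFS mount recovered by the OS team ...
#   restore_nodev_monitoring && service cloudera-scm-agent restart
```

If the line was commented out by hand with a plain "#", restore_nodev_monitoring will not change it, and the line should be uncommented manually as described in the steps above.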