I got a health alert in Cloudera Manager (CM 4.7.2) on all my DataNodes after our NameNode was restarted (NameNode connectivity problems). I can clear the alert by restarting them, but is there a way to clear it without restarting all my DataNodes?
@liam821 Can you clarify what you mean by "clear the health alert"? In Cloudera Manager, once the error condition goes away, the health status should return to good automatically. You should not have to do anything to reset the health state of the service. Which alert are you referring to?
Thanks for the follow-up. What version of CDH is this? Also, do you have HA NameNodes enabled with failover? Without HA, restarting the NN would bring the cluster down anyway, so I'm just trying to better understand the situation you're seeing. I believe there's a known issue where DNs log a warning message about the primary NN after a failover event, but that message is a red herring and should not affect CM's health checks.
Are you sure that HDFS is actually working after the restart? Can you write data or "copyToLocal" a file from HDFS to your local filesystem?
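If it helps, here is a rough sketch of that write/read check as a small Python script. It only wraps the standard `hadoop fs -put`, `-copyToLocal`, and `-rm` commands; the function names, the `/tmp` HDFS directory, and the probe file names are my own assumptions, so adjust them for your cluster. It also assumes the `hadoop` client is on the PATH of the node where you run it.

```python
import os
import subprocess
import tempfile
import uuid


def hdfs_round_trip_commands(local_src, hdfs_path, local_dst):
    """Build the CLI invocations for a write, read-back, and cleanup check."""
    return [
        ["hadoop", "fs", "-put", local_src, hdfs_path],          # write to HDFS
        ["hadoop", "fs", "-copyToLocal", hdfs_path, local_dst],  # read it back
        ["hadoop", "fs", "-rm", hdfs_path],                      # remove the probe
    ]


def check_hdfs(run=subprocess.call, hdfs_dir="/tmp"):
    """Return True if a small file can be written to and read back from HDFS.

    `run` takes an argument list and returns an exit code; it is a parameter
    only so the check can be exercised without a cluster. `hdfs_dir` is an
    assumed writable HDFS directory, not something taken from this thread.
    """
    workdir = tempfile.mkdtemp()
    local_src = os.path.join(workdir, "probe.txt")
    local_dst = os.path.join(workdir, "probe_copy.txt")
    with open(local_src, "w") as f:
        f.write("hdfs probe\n")

    # Unique name so repeated runs do not collide with an existing file.
    hdfs_path = "%s/probe-%s.txt" % (hdfs_dir.rstrip("/"), uuid.uuid4().hex)

    for cmd in hdfs_round_trip_commands(local_src, hdfs_path, local_dst):
        if run(cmd) != 0:
            return False  # stop at the first failing step
    return True
```

Calling `check_hdfs()` on a gateway or DataNode host should return `True` if HDFS is accepting reads and writes. If it returns `False`, the output of the failing `hadoop fs` command is usually more informative than the CM alert itself.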