NameNode High Availability Health Ambari alert

We are frequently getting the "NameNode High Availability Health" alert from Ambari notifications.

We have observed that the Standby or Active NameNode frequently goes to the Unknown state and comes back immediately.

Active['xxxxxxx:50070'], Standby[], Unknown['xxxxxx:50070']

Log records related to this are captured in ambari-alerts.log. What could cause the state to change from Active/Standby to Unknown, and what is the solution? Thanks in advance.

Looks like the issue is similar to:


Hi Srini,

Did you solve your issue?

I have the same issue, and whenever it occurs it coincides with a message like this in the Active (or Standby) NameNode log (hadoop-hdfs-namenode-ServerXXXX.log):

"2017-12-23 07:47:01,761 INFO util.JvmPauseMonitor ( - Detected pause in JVM or host machine (eg GC): pause of approximately 4572ms
No GCs detected"
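
In case it is useful, here is a minimal Python sketch for listing these pauses so that their timestamps can be compared with the alert times in ambari-alerts.log. The function name find_pauses and the 1000 ms threshold are my own choices, and the pattern is based only on the log line quoted above, so it may need adjusting for your log format.

```python
import re
import sys
from datetime import datetime

# Matches the JvmPauseMonitor line quoted above. The text between the
# class name and the pause length varies, so it is skipped with ".*?".
PAUSE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3} \w+\s+util\.JvmPauseMonitor"
    r".*?pause of approximately (\d+)ms"
)


def find_pauses(lines, min_ms=1000):
    """Return (timestamp, pause_ms) for each reported pause of at least min_ms.

    min_ms is an arbitrary illustrative threshold, not a Hadoop setting.
    """
    pauses = []
    for line in lines:
        match = PAUSE_RE.search(line)
        if match and int(match.group(2)) >= min_ms:
            stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            pauses.append((stamp, int(match.group(2))))
    return pauses


# Usage (hypothetical script name): python find_pauses.py <namenode log file>
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], errors="replace") as log:
        for stamp, ms in find_pauses(log):
            print(f"{stamp}  paused {ms} ms")
```

If the pause timestamps line up with the moments the alert reports a NameNode as Unknown, that would support the idea that the two are linked.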

Now I need to find out which process is causing the load.

I hope this helps.

