
HA Namenode Failover Issues

I am running CDH 4.3 managed via Cloudera Manager 5 and am experiencing NameNode failovers multiple times per day. How can I increase the 5000 millis timeout shown in the log message below?


I have the following set in the Failover Controller Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml, but it doesn't seem to help: the error still says the call is timing out after 5000 millis.





2014-06-16 17:25:19,053 WARN org.apache.hadoop.ha.FailoverController: Unable to gracefully make NameNode at m-hdp-mnode0005/ standby (unable to connect) Call From m-hdp-mnode0006/ to m-hdp-mnode0005:9005 failed on socket timeout exception: 5000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/ remote=m-hdp-mnode0005/]; For more details see:



-- Nick
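
For readers hitting the same warning: in stock Apache Hadoop 2.x, "Unable to gracefully make NameNode ... standby" is logged on the graceful-fence path of the failover controller, and the default RPC timeout for that path in core-default.xml is 5000 ms. The fragment below is only a sketch of how such an override might look in the safety valve. It assumes that the graceful-fence timeout is the one being hit here and that the upstream property name applies unchanged to CDH 4.3. The value 15000 is purely illustrative, and this is not necessarily the property the reply below refers to.

```xml
<!-- Sketch only: assumes the upstream Hadoop 2.x property name applies to CDH 4.3. -->
<property>
  <name>ha.failover-controller.graceful-fence.rpc-timeout.ms</name>
  <!-- Upstream default is 5000 (ms); 15000 is an illustrative value, not a recommendation. -->
  <value>15000</value>
</property>
```

A longer timeout only gives the old active NameNode more time to answer. If that NameNode is regularly unresponsive for several seconds, the cause (for example long GC pauses or network problems) is still worth investigating separately.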


Re: HA Namenode Failover Issues

The ZKFC property for monitorHealth RPC timeouts has been changed to be more specific, and is now called