Support Questions

Find answers, ask questions, and share your expertise
Check out our newest addition to the community, the Cloudera Data Analytics (CDA) group hub.

Namenode HA Failure

Rising Star


I came across a particular issue.

The active NameNode (master01) returned a socket timeout on zkfc, soon after he performed automatically failover bringing master02 to active. But master01 remained in a stalemate, on Ambari NameNode could see up, but without a state (active or stand-by); the process of NameNode on server was up and answered the call.

On NameNode log there are no errors, on zkfc log there are some SocketTimeout (I attached the log).

For resolve this situation we had to restart the NameNode service on the master01, which is automatically left in stand-by just started.

Then I tried to do many manual failover and have positive results all time. On system log no have error, lan is always up and no have error for communicate with server.

As I wrote above, the NameNode service is up and running on all 2 server.

Have an idea of what might have happened?

PS:HDFS service work correctly




Take a Tour of the Community
Don't have an account?
Your experience may be limited. Sign in to explore more.