Namenode HA Failure

Hello,

I came across a particular issue.

The ZKFC logged a socket timeout against the active NameNode (master01), and shortly afterwards an automatic failover made master02 active. master01, however, was left in limbo: Ambari showed its NameNode as up but with no state (neither active nor standby), while the NameNode process on the server was running and answering calls.

The NameNode log shows no errors; the ZKFC log contains several SocketTimeout entries (log attached).

To resolve the situation we had to restart the NameNode service on master01, which came up in standby as soon as it started.

I then ran many manual failovers, and every one succeeded. The system log shows no errors, the LAN stayed up the whole time, and there were no errors communicating with the server.
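For reference, the state each NameNode reports can be checked from the command line with `hdfs haadmin -getServiceState <serviceId>`. Below is a minimal Python sketch wrapping that command. The service IDs `nn1` and `nn2` are assumptions (the real ones are listed under `dfs.ha.namenodes.<nameservice>` in hdfs-site.xml), and the helper name `get_ha_state` is hypothetical.

```python
import subprocess


def get_ha_state(service_id, run=subprocess.run):
    """Return the HA state that `hdfs haadmin` reports for one NameNode.

    Returns "unknown" if the command cannot be run, times out, fails,
    or prints nothing.
    """
    try:
        result = run(
            ["hdfs", "haadmin", "-getServiceState", service_id],
            capture_output=True,
            text=True,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        return "unknown"
    state = result.stdout.strip().lower()
    if result.returncode != 0 or not state:
        return "unknown"
    return state


if __name__ == "__main__":
    # "nn1" and "nn2" are assumed service IDs, not taken from this cluster.
    for service_id in ("nn1", "nn2"):
        print(service_id, get_ha_state(service_id))
```

In a situation like the one with master01, a result other than `active` or `standby` would flag the node whose state needs a closer look.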

As I wrote above, the NameNode service is up and running on both servers.

Does anyone have an idea of what might have happened?

PS: The HDFS service is working correctly.

nn-errors.txt
