
Logic behind starting active Namenode


Hi,

When we restart both Namenodes in an HA environment, how can we predict which one will become the active Namenode and which one will move to standby mode?

Is there any logic behind this, or does whichever one comes up first become active and the later one become standby?

Any information is highly appreciated.

Thanks in advance.

1 ACCEPTED SOLUTION

Master Guru
@SBandaru

Suppose you have two Namenodes A and B.

A is active and B is standby --> If you restart Namenode A, the ZKFC (ZooKeeper Failover Controller) will detect that Namenode A is not reachable (while the daemon is restarting), fencing will happen, and Namenode B will become the active NN.

---

B is active and A is standby --> If you restart both Namenodes, whichever comes up first and responds to the ZKFC will become active, and the other one will end up as standby.
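
To make the "first one up wins" behaviour concrete, here is a minimal sketch. It assumes the election boils down to a first-come lock acquisition; `LockService` and `elect` are hypothetical names for illustration, not Hadoop or ZooKeeper APIs.

```python
class LockService:
    """Toy stand-in for the coordination service (an assumption, not a real API)."""

    def __init__(self):
        self.owner = None  # no Namenode holds the lock yet

    def try_acquire(self, node):
        # The first caller takes the lock; later callers find it already held.
        if self.owner is None:
            self.owner = node
            return True
        return False


def elect(lock, startup_order):
    """Return {node: state} for nodes coming up in the given order."""
    states = {}
    for node in startup_order:
        states[node] = "active" if lock.try_acquire(node) else "standby"
    return states


if __name__ == "__main__":
    # B finishes starting before A, so B takes the lock.
    print(elect(LockService(), ["B", "A"]))
```

So the previous roles do not matter after a full restart; only the order in which the nodes become ready does.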

---

Now suppose both Namenodes think that they are active and send write requests to the Quorum Journal Manager (QJM) with their epoch numbers. How does QJM handle this situation?

Each JournalNode stores an epoch number locally, which is called the promised epoch. Whenever a JournalNode receives an RPC request along with an epoch number from a Namenode, it compares that epoch number with its promised epoch. If the request comes from a newer Namenode, meaning its epoch number is greater than the promised epoch, the JournalNode records the new epoch number as its promised epoch. If the request comes from a Namenode with an older epoch number, the JournalNode simply rejects the request.
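
The comparison described above can be sketched roughly as follows. This is only an illustration: the `JournalNode` class and `handle_write` method are hypothetical names, not the real Hadoop implementation, and accepting a request whose epoch equals the promised epoch is my assumption.

```python
class JournalNode:
    """Toy model of the promised-epoch check (not the actual Hadoop class)."""

    def __init__(self):
        self.promised_epoch = 0  # highest epoch this node has seen so far

    def handle_write(self, epoch):
        """Return True if the write is accepted, False if it is rejected."""
        if epoch > self.promised_epoch:
            # Newer writer: remember its epoch as the new promise.
            self.promised_epoch = epoch
            return True
        if epoch < self.promised_epoch:
            # Stale writer (for example the old active Namenode): reject.
            return False
        # Same epoch as the promise: assumed to be the current writer.
        return True


if __name__ == "__main__":
    jn = JournalNode()
    print(jn.handle_write(2))  # newer Namenode with epoch 2
    print(jn.handle_write(1))  # older Namenode with epoch 1
```

With this check, a Namenode that still believes it is active but carries an older epoch cannot write to the journal.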

Please refer to https://community.hortonworks.com/articles/27225/how-qjm-works-in-namenode-ha.html for more details about how QJM works.

Hope this information helps! 🙂
