How does NameNode HA with QJM actually work?

Rising Star

I read about QJM in the official Hadoop documentation, but it gives no clear explanation of the control flow.

What I understand so far:

  • When the active NameNode fails, the ZooKeeper Failover Controller (ZKFC) detects it and promotes the standby NameNode to active.

  • Only the active NameNode writes the edit log to the JournalNodes.

  • The standby NameNode stays in sync by reading the edit log from the JournalNodes.

Can anyone please explain how the real flow works?

For example:

  • step 1: when a client connects, the request goes here ....
  • step 2: this component takes care of the request ....
  • step 3: if it fails, this happens and the request goes there ....

In that style, can anyone please explain the complete flow of QJM?

Thanks in advance.

1 ACCEPTED SOLUTION

Rising Star

In an HA cluster there are two NameNodes: one in the Active state and one in the Standby state. The Active NameNode handles all client requests and namespace operations in the cluster, while the Standby keeps its state up to date so that it can take over quickly. To stay synchronized, both NameNodes communicate with a group of daemons called JournalNodes (JNs).

Whenever the Active NameNode modifies the namespace, it logs a record of the modification to a majority of the JNs. The Standby continuously watches the JNs for changes to the edit log, reads the new edits, and applies them to its own namespace. With QJM, the JNs serve as the shared edits storage. When a failover happens, the Standby makes sure it has read all edits from the JNs before it promotes itself to Active.
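To make the "majority of JNs" idea concrete, here is a minimal Python sketch. It is only a toy model, not Hadoop code: the names `JournalNode`, `quorum_write` and `read_edits` are made up for illustration, and real QJM details such as epoch numbers, log segments and recovery are left out.

```python
from dataclasses import dataclass, field


@dataclass
class JournalNode:
    """Toy stand-in for a JournalNode: stores the edits it has accepted."""
    name: str
    up: bool = True
    edits: list = field(default_factory=list)

    def journal(self, txid: int, op: str) -> bool:
        # A node that is down cannot acknowledge the write.
        if not self.up:
            return False
        self.edits.append((txid, op))
        return True


def quorum_write(journal_nodes, txid, op):
    """Active side: send one edit to every JN, succeed only on a majority of acks."""
    acks = sum(1 for jn in journal_nodes if jn.journal(txid, op))
    majority = len(journal_nodes) // 2 + 1
    return acks >= majority


def read_edits(journal_nodes, last_applied_txid):
    """Standby side: collect edits newer than last_applied_txid from reachable JNs."""
    seen = {}
    for jn in journal_nodes:
        if jn.up:
            for txid, op in jn.edits:
                if txid > last_applied_txid:
                    seen[txid] = op
    return [(txid, seen[txid]) for txid in sorted(seen)]


jns = [JournalNode("jn1"), JournalNode("jn2"), JournalNode("jn3")]
ok_all_up = quorum_write(jns, 1, "mkdir /data")
jns[2].up = False                     # one JN fails: 2 of 3 is still a majority
ok_one_down = quorum_write(jns, 2, "create /data/a.txt")
standby_edits = read_edits(jns, 0)    # the Standby still sees both edits
jns[1].up = False                     # two JNs fail: 1 of 3 is not a majority
ok_two_down = quorum_write(jns, 3, "delete /data/a.txt")
```

With three JNs the write still succeeds when one JN is down, and fails once two are down. This is why an odd number of JNs (at least three) is normally recommended.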

The Standby also takes over the role of the Secondary NameNode, because it performs the checkpointing that the Secondary NameNode would otherwise do. An HA cluster therefore does not need a Secondary NameNode.

Hope this helps, @karthik nedunchezhiyan.


8 REPLIES

Rising Star

Yes, I saw that, but he didn't explain the actual workflow.

Rising Star

What happens if the edit log on the JournalNodes becomes large?

Will the standby NameNode send a new FSImage to the active NameNode?

How does the client find the active NameNode?

New Contributor

Hello! Can a client send requests to both the standby and the active NameNode? And if the active NameNode fails to fulfill a request, can the standby NameNode replace it as the active NameNode?

New Contributor

@karthik nedunchezhiyan A simplified explanation of the process:

With NameNode HA there are two NNs: one Active NN and one Standby NN.

1) DataNodes send heartbeats and block reports to both NNs, so both the Active and the Standby know where the blocks are located.

2) The JournalNodes hold the shared edits. On every write operation, the Active NN records the edit on a majority of the JNs. The Standby NN then reads the new edits from the JNs and applies them to its own copy of the namespace.

3) This way the Standby's namespace closely tracks the Active's at any point in time.

4) ZooKeeper is responsible for holding the lock that marks which NN is Active.

5) There are two ZooKeeper Failover Controllers (ZKFCs), one per NN, each responsible for monitoring the health of its NN.

6) When ZooKeeper stops hearing from the ZKFC of the Active NN, the lock is released. The other ZKFC acquires it, and the Standby NN becomes the Active NN.
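Steps 4 to 6 can be sketched in a few lines of Python. This is only a toy simulation under simplifying assumptions: `LockService` and `FailoverController` are hypothetical names standing in for ZooKeeper and the ZKFC, no real ZooKeeper client is used, and fencing of the old Active NN is not modelled.

```python
class LockService:
    """Toy stand-in for the ZooKeeper lock: at most one holder at a time."""

    def __init__(self):
        self.holder = None

    def try_acquire(self, owner: str) -> bool:
        # The lock is granted if it is free or already held by this owner.
        if self.holder is None:
            self.holder = owner
            return True
        return self.holder == owner

    def session_expired(self, owner: str) -> None:
        # Models the lock disappearing when its holder's session ends.
        if self.holder == owner:
            self.holder = None


class FailoverController:
    """Toy stand-in for one ZKFC watching one NameNode."""

    def __init__(self, nn_name: str, lock: LockService):
        self.nn_name = nn_name
        self.lock = lock
        self.nn_healthy = True
        self.state = "standby"

    def tick(self) -> str:
        """One monitoring round: check health, then try to hold the lock."""
        if not self.nn_healthy:
            # Unhealthy NameNode: the lock is given up so the peer can take over.
            self.lock.session_expired(self.nn_name)
            self.state = "down"
        elif self.lock.try_acquire(self.nn_name):
            self.state = "active"
        else:
            self.state = "standby"
        return self.state


lock = LockService()
zkfc1 = FailoverController("nn1", lock)
zkfc2 = FailoverController("nn2", lock)

before = (zkfc1.tick(), zkfc2.tick())   # nn1 takes the lock first
zkfc1.nn_healthy = False                # nn1 fails
after = (zkfc1.tick(), zkfc2.tick())    # nn2's controller picks up the lock
```

Whichever controller holds the lock decides which NN is Active, so there is never more than one Active NN as far as the lock is concerned.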

[Attached image: 58436-nnha.png]

Community Manager

@Ankita3087, Welcome to Cloudera Community. As this is an older post, you would have a better chance of receiving a resolution by starting a new thread. This will also be an opportunity to provide details specific to your environment that could aid others in assisting you with a more accurate answer to your question. You can link this thread as a reference in your new post.



Regards,

Vidya Sargur,
Community Manager

