How does a JournalNode keep itself up to date?

Rising Star

Let's take a scenario: I have 3 JournalNodes (1, 2, 3). All three were up to date initially. I stopped the 2nd JournalNode for some time, and now I am going to bring it back up. How will this JournalNode receive and save the edit logs it missed?

1 ACCEPTED SOLUTION

Master Mentor

@karthik nedunchezhiyan

When one of the JournalNodes is lagging behind the others (e.g. because its local disk is slower or there was a network blip), it receives edits after they have already been committed to a majority. It can tell this because the committed txid included in the request info is higher than the highest txid in the actual batch to be written. In that case the batch is known to have been fsynced to a quorum of nodes already, so the laggy node can skip its own fsync(), which helps it catch back up.
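The check described above can be sketched in a few lines. This is only an illustration of the comparison, not the actual JournalNode code: the names RequestInfo, is_lagging and should_fsync are hypothetical, and the strict "higher than" comparison simply follows the wording of the explanation.

```python
from dataclasses import dataclass


@dataclass
class RequestInfo:
    """Hypothetical stand-in for the metadata sent along with each batch."""
    committed_txid: int  # highest txid the writer knows is committed on a quorum


def is_lagging(req: RequestInfo, first_txid: int, num_txns: int) -> bool:
    """True if the whole batch is older than what a quorum has already committed."""
    last_txid_in_batch = first_txid + num_txns - 1
    # Assumption: strict comparison, as in the description above.
    return req.committed_txid > last_txid_in_batch


def should_fsync(req: RequestInfo, first_txid: int, num_txns: int) -> bool:
    """A lagging node can skip the fsync because a quorum already holds the data."""
    return not is_lagging(req, first_txid, num_txns)


# Up-to-date node: batch 101..105 is newer than committed txid 100, so it fsyncs.
print(should_fsync(RequestInfo(committed_txid=100), first_txid=101, num_txns=5))
# Lagging node: batch 101..105 is older than committed txid 200, so it skips fsync.
print(should_fsync(RequestInfo(committed_txid=200), first_txid=101, num_txns=5))
```

Skipping the fsync is safe here only because durability is already guaranteed by the quorum; the lagging node is just replaying data that cannot be lost.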

The Active NameNode writes edits to the URI configured in the property below. That URI names the JournalNodes that together provide the shared edits storage. It is written ONLY by the Active NameNode and read by the Standby NameNode, which uses it to stay up to date with all the file system changes the Active NameNode makes.

Although the URI lists several JournalNode addresses, you should configure only one such URI.

dfs.namenode.shared.edits.dir 
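As a rough illustration of how several JournalNode addresses end up in a single URI, the sketch below builds a value in the qjournal://host1:port1;host2:port2;host3:port3/journalId form. The helper name shared_edits_uri, the host names and the journal ID "mycluster" are made up for the example, and 8485 is assumed as the JournalNode RPC port; check your own cluster's settings.

```python
def shared_edits_uri(journal_nodes, journal_id, port=8485):
    """Build one qjournal:// URI that lists every JournalNode.

    journal_nodes: list of JournalNode host names (hypothetical here)
    journal_id:    identifier of the nameservice whose edits are stored
    port:          JournalNode RPC port (assumed default)
    """
    if not journal_nodes:
        raise ValueError("at least one JournalNode host is required")
    # All JournalNodes go into the same URI, separated by semicolons.
    authority = ";".join(f"{host}:{port}" for host in journal_nodes)
    return f"qjournal://{authority}/{journal_id}"


hosts = ["jn1.example.com", "jn2.example.com", "jn3.example.com"]
print(shared_edits_uri(hosts, "mycluster"))
```

The resulting string is what would go into the single dfs.namenode.shared.edits.dir value.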

QuorumJournalManager is responsible for syncing the missing transactions. On a JournalNode, the missing transactions are recovered by the TransferFsImage class from another JournalNode that is up to date, in this case JournalNodes 1 and 3.

  • Start a 3-node QJM cluster
  • strace -e fdatasync,write -f -p <pid of one JN>
  • Write some txns to the NN; strace will show a lot of fdatasync and write calls
  • kill -STOP that JN for 10-15 seconds
  • kill -CONT that JN

While the JN is still catching up, you will see a bunch of write() calls with no fdatasync calls.

Once it has caught up, it starts syncing again.


2 REPLIES


Rising Star

@Geoffrey Shelton Okot

Thanks, now I understand.