Support Questions

karthiknedunche · ‎06-06-2018

Let's take a scenario: I have three JournalNodes (1, 2, 3). All three were up to date initially. I stopped the 2nd JournalNode for some time, and now I am going to bring it back up. How will this JournalNode receive and save the edit logs it missed?

Shelton · ‎06-06-2018

@karthik nedunchezhiyan

This can happen on JournalNodes when one of the nodes is lagging behind the others (e.g. because its local disk is slower or there was a network blip): it receives edits after they have already been committed to a majority. It can tell this because the committed txid included in the request info is higher than the highest txid in the actual batch to be written. In this case, we know that this batch has already been fsynced to a quorum of nodes, so we can skip the fsync() on the laggy node, helping it to catch back up.
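To make the check above concrete, here is a minimal sketch of the decision in Java. It is not the actual Hadoop source; the class name, method name and parameters (LaggingJournalSketch, shouldFsync, lastTxIdInBatch, committedTxId) are made up for illustration, and it only models the comparison described above.

```java
public class LaggingJournalSketch {

    /**
     * Decide whether this JournalNode needs to fsync the batch it just received.
     *
     * lastTxIdInBatch: highest txid in the batch to be written (hypothetical name).
     * committedTxId:   txid the writer reports as already committed to a quorum
     *                  (hypothetical name).
     */
    static boolean shouldFsync(long lastTxIdInBatch, long committedTxId) {
        // If everything in this batch is already at or below the committed txid,
        // a quorum has already made it durable, so this node is only catching up.
        boolean isLagging = lastTxIdInBatch <= committedTxId;
        return !isLagging;
    }

    public static void main(String[] args) {
        // Up-to-date node: the batch goes beyond what is committed, so fsync.
        System.out.println(shouldFsync(120, 100)); // prints "true"
        // Lagging node: the batch is old news to the quorum, so skip the fsync.
        System.out.println(shouldFsync(120, 500)); // prints "false"
    }
}
```

Skipping the fsync is safe in the second case only because durability is already guaranteed by the other nodes; the lagging node just needs to get the bytes written as fast as it can.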

The Active NameNode writes edits to the URI below, which is the shared address of the JournalNodes and provides the shared edits storage. It is ONLY written to by the Active NameNode, and it is read by the Standby NameNode to stay up to date with all the file system changes the Active NameNode makes.

Though you must specify several JournalNode addresses, you should only configure one of these URIs.

dfs.namenode.shared.edits.dir
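As a rough illustration of "several addresses, one URI": the value of this property normally has the form qjournal://host1:port1;host2:port2;host3:port3/journalId. The sketch below splits such a value into its JournalNode addresses and computes the majority needed for a commit. The host names, the journal ID "mycluster" and the class SharedEditsUriSketch are assumptions for the example, not values from this thread; 8485 is the usual default JournalNode port.

```java
import java.util.Arrays;
import java.util.List;

public class SharedEditsUriSketch {

    /** Split a qjournal:// URI into its individual JournalNode host:port entries. */
    static List<String> journalNodes(String uri) {
        String prefix = "qjournal://";
        if (!uri.startsWith(prefix)) {
            throw new IllegalArgumentException("Not a qjournal URI: " + uri);
        }
        String rest = uri.substring(prefix.length());
        int slash = rest.indexOf('/');
        // Everything before the first '/' is the semicolon-separated address list.
        String addresses = slash < 0 ? rest : rest.substring(0, slash);
        return Arrays.asList(addresses.split(";"));
    }

    /** Number of JournalNodes that must acknowledge a write for it to be committed. */
    static int majority(int journalNodeCount) {
        return journalNodeCount / 2 + 1;
    }

    public static void main(String[] args) {
        // Hypothetical hosts and journal ID, for illustration only.
        String uri = "qjournal://jn1.example.com:8485;jn2.example.com:8485;jn3.example.com:8485/mycluster";
        List<String> nodes = journalNodes(uri);
        System.out.println(nodes.size() + " JournalNodes, majority = " + majority(nodes.size()));
    }
}
```

With three JournalNodes the majority is two, which is why the cluster in the question kept working while JournalNode 2 was down.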

QuorumJournalManager is responsible for syncing the missing transactions. On a JournalNode, the missing transactions are recovered by the TransferFsImage class from another JournalNode that is up to date, in this case JournalNodes 1 and 3.

You can see this for yourself with a quick test:

Start a 3-node QJM cluster
strace -efdatasync,write -f <pid of one JN>
Write some txns to the NN; strace will show a lot of fdatasync and write calls.
kill -STOP <pid of that JN> and leave it stopped for 10-15 seconds
kill -CONT <pid of that JN>

You will see a bunch of write() calls with no fdatasync calls while the JN is still catching up.

Once it has caught up, it starts syncing again.

karthiknedunche · ‎06-07-2018

@Geoffrey Shelton Okot

Thanks, now I understand.

Cloudera Community

How does a JournalNode keep itself up to date?