Created 06-06-2018 12:10 PM
Let's take a scenario: I have three JournalNodes (1, 2, 3). All three were up to date initially. I stopped the 2nd JournalNode for some time, and now I am going to bring it back up. How will this JournalNode receive and save the edit logs it missed?
Created 06-06-2018 01:13 PM
This can happen when one of the JournalNodes is lagging behind the others (e.g. because its local disk is slower or there was a network blip): it receives edits after they have already been committed to a majority. It can tell this because the committed txid included in the request info is higher than the highest txid in the actual batch to be written. In this case, we know that this batch has already been fsynced to a quorum of nodes, so we can skip the fsync() on the laggy node, helping it to catch back up.
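The skip-fsync decision described above can be sketched roughly as follows. This is a toy model, not Hadoop's code: the class ToyJournal, its journal() method and the recorded "write"/"fsync" strings are all made up for illustration, and the comparison simply follows the wording above (committed txid higher than the highest txid in the batch).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ToyJournal:
    """Hypothetical stand-in for one JournalNode's edit log."""
    ops: List[str] = field(default_factory=list)  # record of simulated I/O calls

    def journal(self, first_txid: int, num_txns: int, committed_txid: int) -> bool:
        """Append one batch; return True if the batch was fsynced."""
        last_txid = first_txid + num_txns - 1
        # Lagging: the committed txid sent with the request is already past
        # this batch, so a quorum of other nodes has made it durable.
        is_lagging = committed_txid > last_txid
        self.ops.append("write")
        if not is_lagging:
            self.ops.append("fsync")
        return not is_lagging


def replay_demo() -> List[str]:
    """Feed four batches to a node while the quorum is committed up to txid 30."""
    node = ToyJournal()
    node.journal(first_txid=1, num_txns=10, committed_txid=30)   # lagging
    node.journal(first_txid=11, num_txns=10, committed_txid=30)  # lagging
    node.journal(first_txid=21, num_txns=10, committed_txid=30)  # caught up
    node.journal(first_txid=31, num_txns=10, committed_txid=30)  # caught up
    return node.ops
```

Calling replay_demo() returns the sequence of simulated calls: plain writes while the node is behind, then write followed by fsync once it has caught up.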
The Active NameNode writes edits to the URI below, a single shared address for the group of JournalNodes that provides the shared edits storage. It is written to ONLY by the Active NameNode and read by the Standby NameNode, which uses it to stay up to date with all the file system changes the Active NameNode makes.
Though you must specify several JournalNode addresses, you should only configure one of these URIs.
dfs.namenode.shared.edits.dir
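To make the "several addresses, one URI" point concrete, here is a small sketch that splits a value of this property into its parts. The URI form is qjournal://host1:port1;host2:port2;host3:port3/journalId. The host names and the journal ID "mycluster" in the usage note are placeholders, and the helper names parse_qjournal_uri and quorum_size are invented for this example.

```python
from typing import List, Tuple

DEFAULT_JN_PORT = 8485  # default JournalNode RPC port


def parse_qjournal_uri(uri: str) -> Tuple[List[Tuple[str, int]], str]:
    """Split a qjournal:// URI into ([(host, port), ...], journal_id)."""
    scheme = "qjournal://"
    if not uri.startswith(scheme):
        raise ValueError("expected a qjournal:// URI")
    authority, sep, journal_id = uri[len(scheme):].partition("/")
    if not sep or not journal_id:
        raise ValueError("missing journal ID after the host list")
    nodes = []
    # JournalNode addresses are separated by semicolons inside one URI.
    for entry in authority.split(";"):
        host, _, port = entry.partition(":")
        if not host:
            raise ValueError("empty JournalNode address")
        nodes.append((host, int(port) if port else DEFAULT_JN_PORT))
    return nodes, journal_id


def quorum_size(num_nodes: int) -> int:
    """Smallest number of nodes that forms a majority."""
    return num_nodes // 2 + 1
```

For example, parse_qjournal_uri("qjournal://jn1.example.com:8485;jn2.example.com:8485;jn3.example.com:8485/mycluster") yields three (host, port) pairs and the journal ID "mycluster", and quorum_size(3) is 2, which is why the cluster keeps working while one of three JournalNodes is down.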
QuorumJournalManager is responsible for syncing the missing transactions. On a JournalNode, the missing transactions are recovered by the TransferFsImage class from another JournalNode that is up to date, in this case JournalNodes 1 and 3.
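As a rough picture of what "recovering the missing transactions" means, the sketch below compares the edit segments a restarted node holds with those on an up-to-date peer. It is only a toy model under my own assumptions: segments are represented as (first_txid, last_txid) pairs, and missing_segments is a hypothetical helper, not part of QuorumJournalManager or TransferFsImage.

```python
from typing import List, Tuple

Segment = Tuple[int, int]  # (first_txid, last_txid), both inclusive


def missing_segments(local: List[Segment], peer: List[Segment]) -> List[Segment]:
    """Return the segments the peer holds that the local node lacks, in txid order."""
    have = set(local)
    return sorted(seg for seg in peer if seg not in have)


# JournalNode 2 was stopped after txid 200; JournalNode 1 kept receiving edits.
node2 = [(1, 100), (101, 200)]
node1 = [(1, 100), (101, 200), (201, 300), (301, 400)]
to_fetch = missing_segments(node2, node1)
```

Here to_fetch holds the two segments covering txids 201 to 400, which are the ones node 2 would need to copy from a peer.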
While the node is still catching up, you will see a series of write() calls with no fdatasync calls.
Once it has caught up, it starts syncing again.
Created 06-07-2018 03:50 AM
Thanks, now I understand.