Difference between journal and edits?

Hi,

Can anyone explain the difference between the journal and the edits?

1 ACCEPTED SOLUTION

File system metadata is stored in two different constructs: the fsimage and the edit log. The fsimage is a file that represents a point-in-time snapshot of the filesystem's metadata. However, while the fsimage file format is very efficient to read, it is poorly suited to small incremental updates such as renaming a single file. Thus, rather than writing a new fsimage every time the namespace is modified, the NameNode instead records the modifying operation in the edit log for durability. This way, if the NameNode crashes, it can restore its state by first loading the fsimage and then replaying all the operations (also called edits or transactions) in the edit log to catch up to the most recent state of the file system.
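To make the "load the fsimage, then replay the edits" idea concrete, here is a minimal Python sketch. It is only a toy model and not how the NameNode is implemented: the `Namespace` class, the `(txid, op, args)` tuple layout, and the `recover` helper are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Namespace:
    """Toy stand-in for the in-memory namespace: just a set of paths."""
    paths: set = field(default_factory=set)
    last_txid: int = 0  # id of the last transaction reflected in this state


def apply_edit(ns, edit):
    """Apply one logged operation (hypothetical layout: txid, op, args)."""
    txid, op, args = edit
    if op == "create":
        ns.paths.add(args[0])
    elif op == "delete":
        ns.paths.discard(args[0])
    elif op == "rename":
        src, dst = args
        if src in ns.paths:
            ns.paths.remove(src)
            ns.paths.add(dst)
    else:
        raise ValueError(f"unknown operation: {op}")
    ns.last_txid = txid


def recover(fsimage, edit_log):
    """Rebuild state: load the snapshot, then replay only the newer edits."""
    ns = Namespace(paths=set(fsimage["paths"]), last_txid=fsimage["last_txid"])
    for edit in edit_log:
        if edit[0] > ns.last_txid:  # skip edits already captured in the snapshot
            apply_edit(ns, edit)
    return ns


# Snapshot taken after transaction 2; the log holds transactions 1..5.
fsimage = {"paths": {"/a", "/b"}, "last_txid": 2}
edit_log = [
    (1, "create", ("/a",)),
    (2, "create", ("/b",)),
    (3, "rename", ("/b", "/c")),
    (4, "create", ("/d",)),
    (5, "delete", ("/a",)),
]

state = recover(fsimage, edit_log)
print(sorted(state.paths), state.last_txid)  # ['/c', '/d'] 5
```

The point of the sketch is that each edit is a small append to a log, while the snapshot only has to be rewritten occasionally.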

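In an HA setup, edit log modifications must be written to a majority of the JournalNodes. With three JournalNodes a write needs two acknowledgements, so the system can tolerate the failure of a single machine; in general, N JournalNodes tolerate at most (N - 1) / 2 failures.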
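The majority rule can be worked through with a few lines of Python. This is just arithmetic on the quorum size, not actual HDFS code; the function names and the list of boolean acknowledgements are hypothetical.

```python
def majority(num_journal_nodes):
    """Smallest number of JournalNodes that forms a majority."""
    return num_journal_nodes // 2 + 1


def tolerated_failures(num_journal_nodes):
    """How many JournalNodes can be down while a majority is still reachable."""
    return (num_journal_nodes - 1) // 2


def edit_is_committed(acks):
    """acks holds one boolean per JournalNode: did it persist the edit?"""
    return sum(acks) >= majority(len(acks))


for n in (3, 5):
    print(n, majority(n), tolerated_failures(n))

print(edit_is_committed([True, True, False]))   # two of three persisted the edit
print(edit_is_committed([True, False, False]))  # only one of three did
```

With three JournalNodes the quorum is two, so one node can fail; going to five raises the quorum to three and the tolerated failures to two.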

REPLIES

https://hadoop.apache.org/docs/r2.4.1/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.ht...

The edit log is the "transaction log" of HDFS. This means a transaction (create a file, delete it, ...) is committed once it has been persisted to the edit log. In the good old days the edit log was local to the NameNode and was merged into the fsimage by the Secondary NameNode. In HDFS HA the edit log is distributed across three JournalNodes. Each of them still writes an edit log, but now as part of a quorum (i.e. a change needs to be persisted by a majority of the JournalNodes). But really, the link explains it all very nicely.

@Benjamin Leonhardi, thanks for sharing this useful information and link.
