When the Secondary NameNode performs a checkpoint, i.e. once it writes the updated fsimage to the NameNode, does the old fsimage file get deleted?

When the Secondary NameNode performs a checkpoint, i.e. once it writes the updated fsimage to the NameNode, does the old fsimage file get deleted?

1 ACCEPTED SOLUTION

The prior answer does not actually answer the original stated question.

When the Secondary NameNode performs a checkpoint, i.e. once it writes the updated fsimage to the NameNode, does the old fsimage file get deleted?

Yes, old fsimage files get deleted. However, a certain number of prior fsimage files will be retained. The exact number to retain is controlled by configuration property dfs.namenode.num.checkpoints.retained in hdfs-site.xml. If unspecified, then the default value is 2.

<property>
  <name>dfs.namenode.num.checkpoints.retained</name>
  <value>2</value>
  <description>The number of image checkpoint files (fsimage_*) that will be retained by
  the NameNode and Secondary NameNode in their storage directories. All edit
  logs (stored on edits_* files) necessary to recover an up-to-date namespace from the oldest retained
  checkpoint will also be retained.
  </description>
</property>
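
To make the retention rule concrete, here is a minimal Python sketch of how the newest N checkpoints could be picked from a directory listing. It is a simplified model and not the NameNode's actual purging code: the function name split_retained and the file names in the listing are made up for illustration, and the edit-log retention mentioned in the description above is not modeled.

```python
import re

# Finalized checkpoints are named fsimage_<txid>. Using fullmatch means that
# fsimage_<txid>.md5 files and in-progress fsimage.ckpt_<txid> files are skipped.
FSIMAGE_RE = re.compile(r"fsimage_(\d+)")


def split_retained(filenames, num_retained=2):
    """Return (kept, purged) fsimage names, newest transaction ID first."""
    images = []
    for name in filenames:
        match = FSIMAGE_RE.fullmatch(name)
        if match:
            images.append((int(match.group(1)), name))
    images.sort(reverse=True)  # highest transaction ID = newest checkpoint
    names = [name for _, name in images]
    return names[:num_retained], names[num_retained:]


# Hypothetical directory listing, for illustration only.
listing = [
    "fsimage_0000000000000000100",
    "fsimage_0000000000000000100.md5",
    "fsimage_0000000000000000250",
    "fsimage_0000000000000000250.md5",
    "fsimage_0000000000000000400",
    "fsimage_0000000000000000400.md5",
    "edits_0000000000000000251-0000000000000000400",
]
kept, purged = split_retained(listing, num_retained=2)
print("kept:  ", kept)
print("purged:", purged)
```

With the default of 2, the two images with the highest transaction IDs are kept and the oldest one becomes eligible for deletion.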

A few prior fsimage files are retained because they can be useful for post-mortem troubleshooting or, in some disastrous cases, as a way to restore a cluster to a prior state. (However, restoring an old fsimage loses all data that was saved after that checkpoint, so this is not standard operating procedure.)

10 REPLIES

@Avinash C

Checkpointing merges the old fsimage with the edit logs to create a checkpoint file, which is named something like "fsimage.ckpt_*".

If checkpointing succeeds, the "fsimage.ckpt_*" file gets renamed. Internally, the NameNode first validates and verifies the "fsimage.ckpt_*" file, and only then renames it to become the new fsimage file.

If, on the other hand, the NameNode finds the "fsimage.ckpt_*" file to be invalid, the file is not renamed and stays in the NameNode directory. It can later be used to understand why the file was invalid.
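
The validate-then-rename behaviour described above can be sketched in Python as follows. This is only an illustration of the pattern, not HDFS code: promote_checkpoint is a hypothetical helper, and comparing a SHA-256 digest is an assumption standing in for whatever validation the NameNode really performs.

```python
import hashlib
import os
import tempfile


def promote_checkpoint(ckpt_path, expected_digest):
    """Rename fsimage.ckpt_<txid> to fsimage_<txid> only if the digest matches.

    Returns the new path on success, or None if validation fails, in which
    case the checkpoint file is left in place for later inspection.
    """
    with open(ckpt_path, "rb") as f:
        actual = hashlib.sha256(f.read()).hexdigest()
    if actual != expected_digest:
        return None
    directory, name = os.path.split(ckpt_path)
    final_name = name.replace("fsimage.ckpt_", "fsimage_", 1)
    final_path = os.path.join(directory, final_name)
    os.replace(ckpt_path, final_path)  # a rename, not a copy
    return final_path


# Demo in a throwaway directory with made-up contents.
with tempfile.TemporaryDirectory() as workdir:
    ckpt = os.path.join(workdir, "fsimage.ckpt_0000000000000000400")
    with open(ckpt, "wb") as f:
        f.write(b"example image bytes")
    good = hashlib.sha256(b"example image bytes").hexdigest()
    print(promote_checkpoint(ckpt, "not-the-right-digest"))  # validation fails
    print(os.path.basename(promote_checkpoint(ckpt, good)))  # renamed file
```

Because the final name only appears after validation passes, a reader of the directory never sees a half-written file under the finalized fsimage name.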

One way a "fsimage.ckpt_*" file can end up invalid is if the NameNode is killed while checkpointing is in progress, before the "fsimage.ckpt_*" file has been renamed to the actual fsimage file. This leaves the checkpoint incomplete, and on the next NameNode start (or the next checkpoint) checkpointing starts again by loading the previous fsimage and applying the remaining edits.

It will not use the last "fsimage.ckpt_*" file.
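
As a rough illustration of the recovery behaviour described above, the following Python sketch picks the newest finalized fsimage from a listing and reports leftover "fsimage.ckpt_*" files separately, without ever choosing them. The function name pick_startup_image and the sample file names are hypothetical, and the real NameNode logic involves more than this (for example, replaying edit logs).

```python
import re

FINAL_RE = re.compile(r"fsimage_(\d+)")      # finalized checkpoint
CKPT_RE = re.compile(r"fsimage\.ckpt_(\d+)")  # incomplete or unverified checkpoint


def pick_startup_image(filenames):
    """Return (newest finalized fsimage or None, list of leftover ckpt files)."""
    finalized, leftovers = [], []
    for name in filenames:
        if FINAL_RE.fullmatch(name):
            finalized.append(name)
        elif CKPT_RE.fullmatch(name):
            leftovers.append(name)
    # Compare by numeric transaction ID, not by string order.
    newest = max(finalized, key=lambda n: int(n.rsplit("_", 1)[1]), default=None)
    return newest, leftovers


# Hypothetical listing after a NameNode was killed in the middle of a checkpoint.
after_crash = [
    "fsimage_0000000000000000100",
    "fsimage_0000000000000000250",
    "fsimage.ckpt_0000000000000000400",
    "edits_0000000000000000251-0000000000000000400",
]
image, leftovers = pick_startup_image(after_crash)
print("load:", image)
print("ignored:", leftovers)
```

In this model the image at transaction 250 is loaded and the later edits are applied on top of it, while the leftover file at transaction 400 is only reported.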