ajmariella0713 · ‎03-21-2024

9een · ‎03-21-2024

edits file: An edits file is a log of every file system change (file creation, file deletion, permission changes, and so on) made after the most recent fsimage file was written.

The checkpointing process then periodically merges the most recent fsimage with the edits (the new transactions) to create a new fsimage.

Although the edits files are redundant once they have been merged into the fsimage, they are kept for safety and potential recovery purposes; this is part of the regular design. The number retained is bounded by default, however. Two configuration parameters control this:

a. "dfs.namenode.num.extra.edits.retained" (default 1000000): This determines how many extra transactions to keep, regardless of how many edits files they are spread across.
b. "dfs.namenode.max.extra.edits.segments.retained" (default 10000): This serves as a secondary cap on the former. No more than about 10000 extra edits files are kept, even when those files together hold fewer than the 1 million transactions allowed by the parameter above.
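
How the two limits interact can be sketched in a few lines of Python. This is a simplified model of the retention rule described above, not the NameNode's actual code: the function name `purge_threshold`, the `(first_txid, last_txid)` segment tuples, and the `min_required_txid` argument (the first transaction after the oldest retained fsimage) are all made up for illustration.

```python
from typing import List, Tuple

Segment = Tuple[int, int]  # (first_txid, last_txid) of one finalized edits file


def purge_threshold(
    segments: List[Segment],
    min_required_txid: int,
    num_extra_edits: int = 1_000_000,  # dfs.namenode.num.extra.edits.retained
    max_extra_segments: int = 10_000,  # dfs.namenode.max.extra.edits.segments.retained
) -> int:
    """Return the txid below which edits may be purged (simplified model)."""
    # First limit: keep up to num_extra_edits transactions before the required point.
    purge_from = max(0, min_required_txid - num_extra_edits)

    # "Extra" segments are the ones kept only because of that allowance.
    extra = sorted(
        s for s in segments if s[1] >= purge_from and s[0] < min_required_txid
    )

    # Second limit: if that still leaves too many files, give up the oldest ones.
    while len(extra) > max_extra_segments:
        purge_from = extra[0][1] + 1
        extra.pop(0)
    return purge_from


# Ten segments of 100 transactions each; the checkpoint covers txid 1..1000.
segments = [(i * 100 + 1, (i + 1) * 100) for i in range(10)]
print(purge_threshold(segments, 1001, num_extra_edits=500, max_extra_segments=10))  # 501
print(purge_threshold(segments, 1001, num_extra_edits=500, max_extra_segments=3))   # 701
```

In the second call the transaction allowance would keep five files, but the segment cap of 3 pushes the purge point forward to txid 701.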

On a healthy cluster that checkpoints periodically, each edits file should be no larger than about 2-5 MB, so the overall disk footprint of the retained edits is never large enough to cause concern. We have also not seen a situation where these default values needed to be lowered.
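
To check the actual footprint on a host, something like the following Python sketch could be used. It assumes finalized edits files are named `edits_<firstTxId>-<lastTxId>` and live in the `current` subdirectory of a NameNode or JournalNode storage directory; the function name `edits_footprint` and whatever path you pass in are placeholders for your own layout.

```python
import os
import re
from typing import Tuple

# Finalized segments look like edits_<firstTxId>-<lastTxId>;
# edits_inprogress_* files do not match and are skipped.
EDITS_RE = re.compile(r"^edits_(\d+)-(\d+)$")


def edits_footprint(current_dir: str) -> Tuple[int, int, int]:
    """Return (file_count, total_bytes, largest_file_bytes) for finalized edits files."""
    sizes = [
        os.path.getsize(os.path.join(current_dir, name))
        for name in os.listdir(current_dir)
        if EDITS_RE.match(name)
    ]
    return len(sizes), sum(sizes), max(sizes, default=0)
```

A file count far above 10000, or individual files much larger than a few MB, would suggest that checkpointing is not keeping up.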

Unnecessary edits (those beyond the retention limits above) are purged only after each successful checkpoint at the active NameNode, which purges its local edits files and asks the JournalNodes to purge theirs. So if checkpoints are not occurring, edits files will not be purged.
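
Since purging depends on checkpoints succeeding, a quick way to see whether they are happening is to look at the age of the newest fsimage file. The sketch below assumes fsimage files are named `fsimage_<txid>` in the NameNode's `current` directory; the function name `newest_fsimage_age_seconds` and the `now` argument are illustrative only.

```python
import os
import re
import time
from typing import Optional

# Matches fsimage_<txid> only; fsimage_<txid>.md5 sidecar files are skipped.
FSIMAGE_RE = re.compile(r"^fsimage_(\d+)$")


def newest_fsimage_age_seconds(
    current_dir: str, now: Optional[float] = None
) -> Optional[float]:
    """Return the age in seconds of the newest fsimage file, or None if none exist."""
    mtimes = [
        os.path.getmtime(os.path.join(current_dir, name))
        for name in os.listdir(current_dir)
        if FSIMAGE_RE.match(name)
    ]
    if not mtimes:
        return None
    now = time.time() if now is None else now
    return now - max(mtimes)
```

If the reported age is much larger than the configured checkpoint period (`dfs.namenode.checkpoint.period`), checkpoints are probably failing and edits will keep accumulating.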

These two properties are not commonly changed and are therefore not exposed as separate properties in Cloudera Manager. To change them, add them to the NameNode Safety Valve ("NameNode Configuration Safety Valve for hdfs-site.xml"); this requires a NameNode restart.

<property>
<name>dfs.namenode.num.extra.edits.retained</name>
<value>1000000</value>
</property>

<property>
<name>dfs.namenode.max.extra.edits.segments.retained</name>
<value>10000</value>
</property>
