Created 05-12-2024 11:29 PM
The HDFS cluster has HA enabled, with one active NameNode and one standby NameNode. During checkpointing, how do the edit logs and the fsimage get merged?
I tried checking the source code but got confused by the flow. Can someone help with this? How do the edit logs and the fsimage file get merged?
Created 05-13-2024 02:37 AM
@NaveenBlaze, Welcome to our community! To help you get the best possible answer, I have tagged in our HDFS experts @Asok @willx who may be able to assist you further.
Please feel free to provide any additional information or details about your query, and we hope that you will find a satisfactory solution to your question.
Regards,
Vidya Sargur
Created 05-13-2024 02:44 AM
@VidyaSargur Thanks.
@Asok @willx
Can you please help me understand how the transactions in the edit logs are applied to the fsimage on the Standby NameNode?
Created 05-13-2024 06:51 AM
Hi @NaveenBlaze,
Thanks for raising the question. Please refer to the below article for the flow of checkpointing in HDFS.
https://blog.cloudera.com/a-guide-to-checkpointing-in-hadoop/
Regards,
Will Xiao,
Cloudera support
Created 05-13-2024 10:48 PM
I have seen this post and I understand the overall flow, but it does not mention how the edit log transactions are applied to the fsimage.
Can you please clarify that?
Also, how is the block report generated on the DataNode, and how is it handled on the NameNode side?
Created 05-14-2024 06:44 AM
Hi @NaveenBlaze,
I will try to explain this in summarized form. If it does not help, let me know which details you want around checkpointing.
1. During startup, the Standby NameNode loads the latest filesystem image from the disk into its memory. It also retrieves and applies any remaining edit log files from the Journal nodes to update its namespace.
2. This merging process happens in memory each time, ensuring that the namespace stays up-to-date with the latest changes.
3. After startup, the Standby NameNode regularly checks for new edit log files every dfs.ha.tail-edits.period (default: 60 seconds). It streams any new edits directly from the Journal nodes into memory to keep the namespace updated.
4. Additionally, the Standby NameNode checks every dfs.namenode.checkpoint.check.period (default: 60 seconds) whether the number of un-checkpointed transactions has reached dfs.namenode.checkpoint.txns (default: 1,000,000). If it has, the Standby NameNode performs a checkpoint.
5. If the number of un-checkpointed transactions hasn't reached the threshold within dfs.namenode.checkpoint.period (default: 3600 seconds, i.e. 1 hour), the Standby NameNode performs a mandatory checkpoint by saving all accumulated namespace changes from memory to disk (saveNamespace).
6. After the checkpoint, the Standby NameNode asks the Active NameNode to fetch the newly built filesystem image. The Active NameNode streams it and saves it to disk for future restarts.
7. It's important to note that the edit logs stored in the Journal nodes serve as the primary source of truth during startup for both the Active and Standby NameNodes.
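To make steps 4 and 5 concrete, here is a minimal sketch of the checkpoint decision. It is only an illustration under the defaults quoted above, not the actual Hadoop code; the class name CheckpointTriggerSketch and the method needCheckpoint are hypothetical.

```java
// Hypothetical sketch; not the Hadoop StandbyCheckpointer implementation.
class CheckpointTriggerSketch {
    // Defaults quoted in the steps above.
    static final long CHECKPOINT_TXNS = 1_000_000L;     // dfs.namenode.checkpoint.txns
    static final long CHECKPOINT_PERIOD_SECS = 3_600L;  // dfs.namenode.checkpoint.period (1 hour)

    /**
     * Meant to be evaluated once per dfs.namenode.checkpoint.check.period.
     * Returns true when either threshold has been reached.
     */
    static boolean needCheckpoint(long uncheckpointedTxns, long secsSinceLastCheckpoint) {
        boolean txnThresholdReached = uncheckpointedTxns >= CHECKPOINT_TXNS;
        boolean periodElapsed = secsSinceLastCheckpoint >= CHECKPOINT_PERIOD_SECS;
        return txnThresholdReached || periodElapsed;
    }

    public static void main(String[] args) {
        System.out.println(needCheckpoint(50_000L, 600L));     // false: neither threshold reached
        System.out.println(needCheckpoint(1_200_000L, 600L));  // true: transaction threshold reached
        System.out.println(needCheckpoint(50_000L, 3_600L));   // true: checkpoint period elapsed
    }
}
```

Whichever condition is met first triggers the checkpoint, so a busy cluster checkpoints on the transaction count and a quiet one on the timer.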
Created 05-14-2024 10:00 PM
Hi @Majeti ,
Thanks for your response.
2. This merging process happens in memory each time, ensuring that the namespace stays up-to-date with the latest changes.
In this step you mentioned a merging process. How does that merge between the edit logs and the old fsimage actually happen?
Created 05-24-2024 02:34 AM
Hi @NaveenBlaze ,
You can get more info from https://github.com/c9n/hadoop/blob/master/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/h... .
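Notice these lines in the doTailEdits method:
// The namespace image held in memory by the Standby NameNode
FSImage image = namesystem.getFSImage();
// Select the edit log streams that start right after the last applied transaction
streams = editLog.selectInputStreams(lastTxnId + 1, 0, null, false);
// Replay those edits on top of the in-memory namespace
editsLoaded = image.loadEdits(streams, namesystem);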
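Conceptually, the "merge" is a replay: the fsimage is a snapshot of the namespace as of some transaction ID, and each later edit is applied to the in-memory namespace in transaction order. Below is a heavily simplified sketch of that idea. It assumes a namespace that is just a set of paths and only two operation types, and every name in it (EditReplaySketch, Namespace, EditOp, OpType) is hypothetical. The real loadEdits path in Hadoop handles many more operation types and a lot of error handling.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.TreeSet;

// Hypothetical sketch; every name here is illustrative, not Hadoop API.
class EditReplaySketch {
    enum OpType { MKDIR, DELETE }

    /** One edit log transaction. */
    static final class EditOp {
        final long txId;
        final OpType type;
        final String path;

        EditOp(long txId, OpType type, String path) {
            this.txId = txId;
            this.type = type;
            this.path = path;
        }
    }

    /** In-memory namespace, initially populated from an fsimage snapshot. */
    static final class Namespace {
        final TreeSet<String> paths;
        long lastAppliedTxId;

        Namespace(Collection<String> imagePaths, long imageTxId) {
            this.paths = new TreeSet<>(imagePaths);
            this.lastAppliedTxId = imageTxId;
        }

        /** Replays edits in transaction order and returns how many were applied. */
        int loadEdits(List<EditOp> edits) {
            int applied = 0;
            for (EditOp op : edits) {
                if (op.txId <= lastAppliedTxId) {
                    continue; // already reflected in the loaded image
                }
                if (op.txId != lastAppliedTxId + 1) {
                    throw new IllegalStateException("Gap in edit log before txId " + op.txId);
                }
                switch (op.type) {
                    case MKDIR:
                        paths.add(op.path);
                        break;
                    case DELETE:
                        paths.remove(op.path);
                        break;
                }
                lastAppliedTxId = op.txId;
                applied++;
            }
            return applied;
        }
    }

    public static void main(String[] args) {
        // The image contains /a and covers everything up to txId 100.
        Namespace ns = new Namespace(Arrays.asList("/a"), 100L);
        int applied = ns.loadEdits(Arrays.asList(
                new EditOp(101L, OpType.MKDIR, "/b"),
                new EditOp(102L, OpType.DELETE, "/a")));
        System.out.println(applied + " edits applied, paths=" + ns.paths
                + ", lastAppliedTxId=" + ns.lastAppliedTxId);
    }
}
```

In this picture a checkpoint is nothing more than writing the current in-memory namespace, together with its last applied transaction ID, back to disk as a new fsimage.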