Hadoop cluster with active/standby NameNode + gap in the edit log


We have an Ambari-managed cluster running HDP version `2.6.5`.

The cluster has two NameNodes (one active, one standby) and 65 DataNode machines.

The standby NameNode fails to start, and its logs show the following:

 

2021-01-01 15:19:43,269 ERROR namenode.NameNode (NameNode.java:main(1783)) - Failed to start namenode.
java.io.IOException: There appears to be a gap in the edit log. We expected txid 90247527115, but got txid 90247903412.
at org.apache.hadoop.hdfs.server.namenode.MetaRecoveryContext.editLogLoaderPrompt(MetaRecoveryContext.java:94)
at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:215)

 

 

 


For now the active NameNode is up, but the standby NameNode is down.

Regarding the error

java.io.IOException: There appears to be a gap in the edit log. We expected txid 90247527115, but got txid 90247903412.

what is the preferred way to fix this problem?
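For reference, the size of the gap can be read straight off the error message. The sketch below is only an illustration: `edit_log_gap` is a hypothetical helper name, and it assumes the message has exactly the wording shown in the log above.

```shell
#!/usr/bin/env bash
# Sketch: pull the expected and actual txids out of the error line
# and report which transaction ids are missing.
# edit_log_gap is a hypothetical helper, not part of Hadoop.
edit_log_gap() {
  local line="$1" expected got
  expected=$(printf '%s\n' "${line}" | sed -n 's/.*expected txid \([0-9][0-9]*\).*/\1/p')
  got=$(printf '%s\n' "${line}" | sed -n 's/.*got txid \([0-9][0-9]*\).*/\1/p')
  # Fail if either id could not be parsed from the line.
  [ -n "${expected}" ] && [ -n "${got}" ] || return 1
  echo "missing txids ${expected}..$((got - 1)) ($((got - expected)) transactions)"
}

edit_log_gap "java.io.IOException: There appears to be a gap in the edit log. We expected txid 90247527115, but got txid 90247903412."
```

For the message in this thread it prints `missing txids 90247527115..90247903411 (376297 transactions)`, which is the range of edits the standby cannot find.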

 


Michael-Bronson
2 ACCEPTED SOLUTIONS

Master Guru

@mike_bronson7 There is a solution that can help. Run the following command on the standby NameNode:
# su hdfs -l -c 'hdfs namenode -recover'

The following message appears:
You have selected Metadata Recovery mode.  This mode is intended to recover lost metadata on a corrupt filesystem.  Metadata recovery mode often permanently deletes data from your HDFS filesystem.  Please back up your edit log and fsimage before trying this!   Are you ready to proceed? (Y/N) (Y or N)
To proceed, answer "Y" (yes). The recovery process will read as much of the edit log as possible. When there is an error or an ambiguity, it will prompt you for how to proceed, with the options Continue, Stop, Quit, and Always.

Data loss (due to skipped or missing transactions) is possible when using this method.
Therefore, do not use this method if data or transaction loss must be avoided.
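Since the recovery prompt itself asks for a backup of the edit log and fsimage, here is a minimal sketch of one way to archive the metadata directory first. `backup_nn_metadata` and the `/var/backups/hdfs` destination are hypothetical; the sketch assumes the directory configured in `dfs.namenode.name.dir` is a plain local path and that the standby NameNode is stopped.

```shell
#!/usr/bin/env bash
# Sketch: archive a NameNode metadata directory before running recovery.
# backup_nn_metadata is a hypothetical helper, not part of Hadoop.
backup_nn_metadata() {
  local name_dir="$1"   # one directory listed in dfs.namenode.name.dir
  local dest_dir="$2"   # where the archive should be written
  local archive="${dest_dir}/nn-metadata-$(date +%Y%m%d-%H%M%S).tar.gz"

  [ -d "${name_dir}" ] || { echo "no such directory: ${name_dir}" >&2; return 1; }
  mkdir -p "${dest_dir}" || return 1
  # Archive the whole directory; fsimage and edits files live under current/.
  tar -czf "${archive}" -C "$(dirname "${name_dir}")" "$(basename "${name_dir}")" || return 1
  # Print the archive path so the caller can record it.
  echo "${archive}"
}

# Example (assumed paths, run on the standby NameNode host):
#   name_dir=$(hdfs getconf -confKey dfs.namenode.name.dir | cut -d, -f1)
#   backup_nn_metadata "${name_dir}" /var/backups/hdfs
```

If the configured value carries a `file://` prefix, it would need to be stripped before being passed to the helper.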


Cheers!


Can the following procedure also help?

  1. Put the active NN in safe mode

    sudo -u hdfs hdfs dfsadmin -safemode enter

  2. Run a saveNamespace operation on the active NN

    sudo -u hdfs hdfs dfsadmin -saveNamespace

  3. Leave safe mode

    sudo -u hdfs hdfs dfsadmin -safemode leave

  4. Log in to the standby NN

  5. Run the command below on the standby NameNode to fetch the latest fsimage saved in the steps above.

    sudo -u hdfs hdfs namenode -bootstrapStandby -force

 

Michael-Bronson

avatar
Expert Contributor

Yes @mike_bronson7, the above steps also work.
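The active-NameNode part of that procedure (steps 1 to 3) could be wrapped in a small script so that safe mode is always left, even if the save fails. This is only a sketch: `HDFS_CMD` and `checkpoint_active_nn` are illustrative names, not Hadoop features, and the default assumes `sudo -u hdfs hdfs` works on the host as in the steps above.

```shell
#!/usr/bin/env bash
# Sketch: checkpoint the active NameNode's namespace and always leave safe mode.
# HDFS_CMD and checkpoint_active_nn are hypothetical names used for illustration.
HDFS_CMD="${HDFS_CMD:-sudo -u hdfs hdfs}"

checkpoint_active_nn() {
  local rc=0
  # Step 1: enter safe mode; give up if that fails.
  ${HDFS_CMD} dfsadmin -safemode enter || return 1
  # Step 2: save the namespace; remember a failure but do not return yet.
  ${HDFS_CMD} dfsadmin -saveNamespace || rc=$?
  # Step 3: leave safe mode whether or not the save succeeded.
  ${HDFS_CMD} dfsadmin -safemode leave || rc=$?
  return "${rc}"
}

# Example: run on the active NameNode host, then bootstrap the standby
# only if the checkpoint succeeded.
#   checkpoint_active_nn && echo "namespace saved"
```

After it succeeds, steps 4 and 5 (`sudo -u hdfs hdfs namenode -bootstrapStandby -force` on the standby) proceed as described in the question.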