What is the preferred solution for corrupted NameNode metadata?

We have an Ambari-managed cluster running HDP 2.6.5.

The cluster has two NameNodes (one active, one standby) and 65 DataNode machines.

The standby NameNode fails to start, and its logs show the following:

2021-01-01 15:19:43,269 ERROR namenode.NameNode (NameNode.java:main(1783)) - Failed to start namenode.
java.io.IOException: There appears to be a gap in the edit log. We expected txid 90247527115, but got txid 90247903412.
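To get a feel for how large the gap is, the two transaction IDs can be pulled out of the error message and subtracted. This is only a minimal sketch: the variable names are illustrative, and it assumes the message format shown in the log line above.

```shell
# Log line copied from the NameNode output above.
line='java.io.IOException: There appears to be a gap in the edit log. We expected txid 90247527115, but got txid 90247903412.'

# Extract the two transaction IDs with sed (basic regular expressions only).
expected=$(printf '%s\n' "$line" | sed -n 's/.*expected txid \([0-9][0-9]*\),.*/\1/p')
got=$(printf '%s\n' "$line" | sed -n 's/.*got txid \([0-9][0-9]*\).*/\1/p')

# Transactions expected .. got-1 are the ones the standby could not find.
missing=$((got - expected))
echo "missing transactions: $missing"
```

For the values in this log the difference is 376297 transactions, so the standby is missing a sizeable range of edits rather than a single record.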

The state shown in Ambari is in the attached screenshot (Capture.PNG).


For now the active NameNode is up but the standby NameNode is down, and the root cause of this issue is that the **NameNode metadata is damaged/corrupted.**

So we have two possible solutions, A or B.

A)

Run the following recovery on the standby NameNode:

su
hadoop namenode -recover

B)

Put the active NameNode into safe mode:

su hdfs
hdfs dfsadmin -safemode enter

Run a saveNamespace operation on the active NameNode:

su hdfs
hdfs dfsadmin -saveNamespace

Leave safe mode:

su hdfs
hdfs dfsadmin -safemode leave

Log in to the standby NameNode.

Run the command below on the standby NameNode to fetch the latest fsimage saved in the steps above:

su hdfs
hdfs namenode -bootstrapStandby -force
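The Solution B steps could be collected into two small shell functions, one per host. This is a sketch under assumptions, not a tested procedure: `DRY_RUN`, `run_as_hdfs`, `checkpoint_active` and `bootstrap_standby` are hypothetical names introduced here, and it assumes `sudo -u hdfs` is available in place of the interactive `su hdfs` used above. The `hdfs` commands themselves are the ones listed in the steps.

```shell
#!/bin/sh
# DRY_RUN=1 (the default here) only prints the commands; set DRY_RUN=0 to execute them.
DRY_RUN=${DRY_RUN:-1}

# Hypothetical helper: run a command as the hdfs user, or just print it in dry-run mode.
run_as_hdfs() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "[dry-run] $*"
    else
        sudo -u hdfs "$@"
    fi
}

# Run on the active NameNode host: enter safe mode, save the namespace, leave safe mode.
checkpoint_active() {
    run_as_hdfs hdfs dfsadmin -safemode enter || return 1
    run_as_hdfs hdfs dfsadmin -saveNamespace
    status=$?
    # Leave safe mode even if saveNamespace failed, so the cluster is not left read-only.
    run_as_hdfs hdfs dfsadmin -safemode leave
    return "$status"
}

# Run on the standby NameNode host while the standby NameNode process is stopped.
bootstrap_standby() {
    run_as_hdfs hdfs namenode -bootstrapStandby -force
}
```

With the default `DRY_RUN=1`, calling `checkpoint_active` or `bootstrap_standby` prints the commands so they can be reviewed first. Since the standby is already down in this case, `bootstrap_standby` would be called on the standby host only after `checkpoint_active` has completed on the active host.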


Which is the preferred solution (A or B) for our problem?

Michael-Bronson