NN stopped and cannot recover with error "There appears to be a gap in the edit log"
Created on ‎11-14-2013 07:13 AM - edited ‎09-16-2022 01:50 AM
Hi there,
I deployed a single node running CDH and CM for testing. However, once I added some services like Hub, the NN stopped and now cannot be started. It fails with the error: There appears to be a gap in the edit log.
2013-11-14 15:00:01,431 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system shutdown complete.
2013-11-14 15:00:01,432 FATAL org.apache.hadoop.hdfs.server.namenode.NameNode: Exception in namenode join
java.io.IOException: There appears to be a gap in the edit log. We expected txid 8364, but got txid 27381.
    at org.apache.hadoop.hdfs.server.namenode.MetaRecoveryContext.editLogLoaderPrompt(MetaRecoveryContext.java:94)
    at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:158)
    at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:92)
    at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:744)
    at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:660)
    at org.apache.hadoop.hdfs.server.namenode.FSImage.doUpgrade(FSImage.java:349)
    at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:261)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:639)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:476)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:403)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:437)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:613)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:598)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1169)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1233)
2013-11-14 15:00:01,445 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
2013-11-14 15:00:01,448 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at ubcdh/10.0.0.4
************************************************************/
I tried to run "./bin/hadoop namenode -recover", but it returned another error:
13/11/14 14:52:17 INFO hdfs.StateChange: STATE* Safe mode is ON. Use "hdfs dfsadmin -safemode leave" to turn safe mode off.
However, the command to leave safe mode fails with this error:
safemode: Call From ubcdh/10.0.0.4 to ubcdh:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
All my services are running properly and only one node is configured, so I don't think the connection itself is failing.
How can I get this issue fixed?
Created ‎11-14-2013 01:44 PM
Some of your edit log transactions are missing from the namenode metadata directory.
"We expected txid 8364, but got txid 27381" means the edit log transactions from 8364 up to 27380 are missing. If you don't want to lose data, restore the edit logs from a backup of the metadata, if you have one. Otherwise you can point the namenode to txid 27381, but you may lose data if you do.
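To see which transactions are actually missing, it can help to compare the edit segment file names in the metadata directory against the txid the namenode expected. The following is a rough sketch, not an official tool. It assumes the usual naming of finalized segments as edits_<firstTxId>-<lastTxId> and of the open segment as edits_inprogress_<firstTxId>. The function name find_gaps and the sample file names are made up for illustration.

```python
import re

# Assumed naming: finalized segments are edits_<firstTxId>-<lastTxId>,
# the open segment is edits_inprogress_<firstTxId>.
SEGMENT = re.compile(r"^edits_(\d+)-(\d+)$")
INPROGRESS = re.compile(r"^edits_inprogress_(\d+)$")


def find_gaps(filenames, expected_txid):
    """Return a list of (first_missing, last_missing) txid ranges,
    looking only at transactions from expected_txid onward."""
    segments = []
    for name in filenames:
        m = SEGMENT.match(name)
        if m:
            segments.append((int(m.group(1)), int(m.group(2))))
            continue
        m = INPROGRESS.match(name)
        if m:
            # The open segment has no known end, so mark it open-ended.
            segments.append((int(m.group(1)), None))

    gaps = []
    next_txid = expected_txid
    for first, last in sorted(segments, key=lambda s: s[0]):
        if last is not None and last < next_txid:
            continue  # segment ends before the txid we care about
        if first > next_txid:
            gaps.append((next_txid, first - 1))
        if last is None:
            break  # nothing can be checked past the open segment
        next_txid = max(next_txid, last + 1)
    return gaps


if __name__ == "__main__":
    # Illustrative file names only; they are not taken from the poster's system.
    sample = [
        "fsimage_0000000000000008363",
        "edits_0000000000000000001-0000000000000008363",
        "edits_0000000000000027381-0000000000000027400",
        "edits_inprogress_0000000000000027401",
        "seen_txid",
        "VERSION",
    ]
    print(find_gaps(sample, 8364))  # [(8364, 27380)]
```

On a real system you would pass it the listing of the namenode's metadata directory, for example os.listdir() on the current/ subdirectory of whatever dfs.namenode.name.dir points to, together with the expected txid from the error message. It only reads file names and changes nothing.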
Created ‎11-14-2013 07:23 AM
@JUNXIONG I have moved this post to the HDFS discussion board since this is an HDFS specific issue. Hopefully someone in here can assist you.
Regards
Created ‎11-15-2013 07:21 AM
I think this happens when you try to format the namenode....
Created ‎04-28-2015 09:04 AM
Would someone be kind enough to describe the process of "just point the namenode to txid 27381"? I've been having this issue on a fresh HDFS cluster, and this sounds like an easy way to fix it.
Created ‎02-22-2017 06:25 PM
Hi Mythobeast,
Do you have the procedure to resolve this issue?
Created ‎09-09-2017 12:28 AM
How do I point the namenode to a particular txid?
Created ‎02-22-2017 06:22 PM
Hi, I have the same error: "We expected txid xxxx, but got txid yyyy".
How do I change the pointer of the namenode?
Please advise.
Thanks
