Recover from Secondary NN when primary NN fsimage is corrupted

The NameNode cluster ID changed, so I deleted the ../namenode/current directory, which contained the fsimage and edit logs. Now I want to recover my NN from the Secondary NN, which has an exact copy of that current directory.

To do that, I just copied the fsimage from the SNN to the NN. But now when I try to start the NN, it tells me to format the NameNode.

Should I format the NN, or is there another way to recover it?

namenode.log

2017-01-27 01:07:10,184 INFO  util.GSet (LightWeightGSet.java:computeCapacity(354)) - Computing capacity for map NameNodeRetryCache
2017-01-27 01:07:10,184 INFO  util.GSet (LightWeightGSet.java:computeCapacity(355)) - VM type       = 64-bit
2017-01-27 01:07:10,184 INFO  util.GSet (LightWeightGSet.java:computeCapacity(356)) - 0.029999999329447746% max memory 2.9 GB = 922.8 KB
2017-01-27 01:07:10,185 INFO  util.GSet (LightWeightGSet.java:computeCapacity(361)) - capacity      = 2^17 = 131072 entries
2017-01-27 01:07:10,188 INFO  namenode.NNConf (NNConf.java:<init>(62)) - ACLs enabled? true
2017-01-27 01:07:10,188 INFO  namenode.NNConf (NNConf.java:<init>(66)) - XAttrs enabled? true
2017-01-27 01:07:10,188 INFO  namenode.NNConf (NNConf.java:<init>(74)) - Maximum size of an xattr: 16384
2017-01-27 01:07:10,202 INFO  common.Storage (Storage.java:tryLock(715)) - Lock on /hadoop/hdfs/namenode/in_use.lock acquired by nodename 7391@ip.ec2.internal
2017-01-27 01:07:10,204 WARN  namenode.FSNamesystem (FSNamesystem.java:loadFromDisk(743)) - Encountered exception loading fsimage
java.io.IOException: NameNode is not formatted.
        at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:212)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1022)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:741)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:536)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:595)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:762)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:746)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1438)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1504)
2017-01-27 01:07:10,209 INFO  mortbay.log (Slf4jLog.java:info(67)) - Stopped HttpServer2$SelectChannelConnectorWithSafeStartup@ip.ec2.internal:50070
2017-01-27 01:07:10,310 INFO  impl.MetricsSystemImpl (MetricsSystemImpl.java:stop(210)) - Stopping NameNode metrics system...
2017-01-27 01:07:10,310 INFO  impl.MetricsSinkAdapter (MetricsSinkAdapter.java:publishMetricsFromQueue(135)) - ganglia thread interrupted.
2017-01-27 01:07:10,311 INFO  impl.MetricsSystemImpl (MetricsSystemImpl.java:stop(216)) - NameNode metrics system stopped.
2017-01-27 01:07:10,311 INFO  impl.MetricsSystemImpl (MetricsSystemImpl.java:shutdown(605)) - NameNode metrics system shutdown complete.
2017-01-27 01:07:10,311 FATAL namenode.NameNode (NameNode.java:main(1509)) - Failed to start namenode.
java.io.IOException: NameNode is not formatted.
        at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:212)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1022)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:741)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:536)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:595)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:762)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:746)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1438)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1504)
2017-01-27 01:07:10,313 INFO  util.ExitUtil (ExitUtil.java:terminate(124)) - Exiting with status 1
2017-01-27 01:07:10,314 INFO  namenode.NameNode (StringUtils.java:run(659)) - SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at ip.ec2.internal/nn
************************************************************/
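
For readers following along: the clusterID mentioned in the question is recorded in the VERSION file inside each storage directory's current folder. One plausible reading of the "NameNode is not formatted" error above is that only the fsimage file was copied and current/VERSION is missing, though nothing in this thread confirms that. A minimal sketch for checking both, where `cluster_id` and `same_cluster_id` are hypothetical helper names rather than Hadoop tools:

```shell
# Print the clusterID recorded in <storage_dir>/current/VERSION.
# Returns non-zero if the VERSION file is missing.
cluster_id() {
  local version_file="$1/current/VERSION"
  [ -f "$version_file" ] || { echo "missing: $version_file" >&2; return 1; }
  # VERSION is a properties-style file with lines such as clusterID=CID-...
  sed -n 's/^clusterID=//p' "$version_file"
}

# Succeed only if both storage directories record the same clusterID.
# Returns 2 if either VERSION file cannot be read.
same_cluster_id() {
  local a b
  a="$(cluster_id "$1")" || return 2
  b="$(cluster_id "$2")" || return 2
  [ -n "$a" ] && [ "$a" = "$b" ]
}
```

With the paths that appear in the logs of this thread, the comparison would be `same_cluster_id /hadoop/hdfs/namenode /mnt/disk1/hadoop/hdfs/namesecondary`, run as a user that can read both directories.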



Expert Contributor

You shouldn't copy the fsimage from the SNN to the NN manually. Instead, back up your corrupt current directory and create an empty one in its place. Then leave the SNN fsimage in its own directory and start the NameNode with the -importCheckpoint option:

hdfs namenode -importCheckpoint

As described here: https://hadoop.apache.org/docs/r1.2.1/hdfs_user_guide.html#Import+Checkpoint
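
The backup step described above can be sketched as a small shell helper. This is an illustration under assumptions, not a tested recovery procedure: `backup_current_dir` is a hypothetical name, the backup location is whatever you pass in, and it assumes the NameNode is stopped and that you run it as the HDFS service user so the new empty directory gets the right ownership.

```shell
# Move <name_dir>/current to <backup_root>/current.<timestamp>, then create
# an empty <name_dir>/current in its place. Prints the backup path.
backup_current_dir() {
  local name_dir="$1" backup_root="$2" stamp target
  [ -d "$name_dir/current" ] || { echo "no current/ under $name_dir" >&2; return 1; }
  stamp="$(date +%Y%m%d-%H%M%S)"
  target="$backup_root/current.$stamp"
  mkdir -p "$backup_root" || return 1
  mv "$name_dir/current" "$target" || return 1
  mkdir "$name_dir/current" || return 1
  echo "$target"
}

# Example (the backup path is a placeholder):
#   backup_current_dir /hadoop/hdfs/namenode /path/to/nn-backup
#   hdfs namenode -importCheckpoint
```

Keeping the backup outside the name directory means the only thing left for the NameNode to inspect is the empty current folder, and the old metadata stays available if the import does not work out.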

@gnovak This is what I get when I run this command:

17/01/27 02:51:16 INFO mortbay.log: Started HttpServer2$SelectChannelConnectorWithSafeStartup@ip.ec2.internal:50070
17/01/27 02:51:16 WARN common.Util: Path /hadoop/hdfs/namenode should be specified as a URI in configuration files. Please update hdfs configuration.
17/01/27 02:51:16 WARN common.Util: Path /hadoop/hdfs/namenode should be specified as a URI in configuration files. Please update hdfs configuration.
17/01/27 02:51:16 WARN namenode.FSNamesystem: !!! WARNING !!!
        The NameNode currently runs without persistent storage.
        Any changes to the file system meta-data may be lost.
        Recommended actions:
                - shutdown and restart NameNode with configured "dfs.namenode.edits.dir.required" in hdfs-site.xml;
                - use Backup Node as a persistent and up-to-date storage of the file system meta-data.
17/01/27 02:51:16 WARN namenode.FSNamesystem: Only one image storage directory (dfs.namenode.name.dir) configured. Beware of data loss due to lack of redundant storage directories!
17/01/27 02:51:16 WARN namenode.FSNamesystem: Only one namespace edits storage directory (dfs.namenode.edits.dir) configured. Beware of data loss due to lack of redundant storage directories!
17/01/27 02:51:16 WARN common.Util: Path /hadoop/hdfs/namenode should be specified as a URI in configuration files. Please update hdfs configuration.
17/01/27 02:51:16 WARN common.Util: Path /hadoop/hdfs/namenode should be specified as a URI in configuration files. Please update hdfs configuration.
17/01/27 02:51:16 WARN common.Storage: set restore failed storage to true
17/01/27 02:51:16 INFO namenode.FSNamesystem: No KeyProvider found.
 '''
''''
17/01/27 02:51:16 INFO common.Storage: Lock on /hadoop/hdfs/namenode/in_use.lock acquired by nodename 21282@ip-172-31-17-251.ec2.internal
17/01/27 02:51:16 INFO namenode.FSImage: Storage directory /hadoop/hdfs/namenode is not formatted.
17/01/27 02:51:16 INFO namenode.FSImage: Formatting ...
17/01/27 02:51:16 WARN common.Util: Path /mnt/disk1/hadoop/hdfs/namesecondary should be specified as a URI in configuration files. Please update hdfs configuration.
17/01/27 02:51:16 WARN common.Util: Path /mnt/disk1/hadoop/hdfs/namesecondary should be specified as a URI in configuration files. Please update hdfs configuration.
17/01/27 02:51:16 WARN common.Storage: set restore failed storage to true
17/01/27 02:51:16 WARN common.Storage: Storage directory /mnt/disk1/hadoop/hdfs/namesecondary does not exist
17/01/27 02:51:16 WARN namenode.FSNamesystem: Encountered exception loading fsimage
org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /mnt/disk1/hadoop/hdfs/namesecondary is in an inconsistent state: storage directory does not exist or is not accessible.
        at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverStorageDirs(FSImage.java:313)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:202)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.doImportCheckpoint(FSImage.java:515)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:271)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1022)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:741)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:536)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:595)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:762)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:746)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1438)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1504)
17/01/27 02:51:16 INFO mortbay.log: Stopped HttpServer2$SelectChannelConnectorWithSafeStartup@ip-.ec2.internal:50070
17/01/27 02:51:16 INFO impl.MetricsSystemImpl: Stopping NameNode metrics system...
17/01/27 02:51:16 INFO impl.MetricsSinkAdapter: ganglia thread interrupted.
17/01/27 02:51:16 INFO impl.MetricsSystemImpl: NameNode metrics system stopped.
17/01/27 02:51:16 INFO impl.MetricsSystemImpl: NameNode metrics system shutdown complete.
17/01/27 02:51:16 FATAL namenode.NameNode: Failed to start namenode.
org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /mnt/disk1/hadoop/hdfs/namesecondary is in an inconsistent state: storage directory does not exist or is not accessible.
        at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverStorageDirs(FSImage.java:313)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:202)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.doImportCheckpoint(FSImage.java:515)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:271)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1022)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:741)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:536)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:595)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:762)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:746)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1438)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1504)
17/01/27 02:51:16 INFO util.ExitUtil: Exiting with status 1
17/01/27 02:51:16 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at ip-.ec2.internal/
************************************************************/

Expert Contributor

@Punit kumar It says that the secondary namenode directory (/mnt/disk1/hadoop/hdfs/namesecondary) doesn't exist. Does it? Where did you copy the fsimage from previously?

@gnovak

Here is my /mnt/disk1/hadoop/hdfs/namesecondary:

total 36
drwxr-xr-x. 2 hdfs hadoop 28672 Jan 26 05:13 current
-rw-r--r--  1 hdfs hadoop    34 Jan 27 02:43 in_use.lock

@gnovak The configuration file also has some URI issues. Should I change hdfs-site.xml?
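
The "should be specified as a URI" warnings in the log refer to directory settings written as bare paths. As a rough illustration of the form the warning asks for, the sketch below turns an absolute local path into a file:// URI. `to_file_uri` is a hypothetical helper, and it assumes paths without spaces or other characters that would need percent-encoding.

```shell
# Convert an absolute local path to a file:// URI.
# Values that already look like a URI are returned unchanged.
to_file_uri() {
  case "$1" in
    *://*) printf '%s\n' "$1" ;;          # already a URI
    /*)    printf 'file://%s\n' "$1" ;;   # absolute path -> file:///...
    *)     echo "not an absolute path: $1" >&2; return 1 ;;
  esac
}
```

For the name directory in the log, `to_file_uri /hadoop/hdfs/namenode` gives `file:///hadoop/hdfs/namenode`, which is the kind of value that could go into the corresponding property in hdfs-site.xml.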

Expert Contributor

I don't think the URI issues have anything to do with this. Maybe you should try stopping the secondary namenode before doing the import.
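
Before retrying the import, it may help to check the checkpoint directory from the same account that runs the NameNode. The listing above shows an in_use.lock file in namesecondary, which would fit a Secondary NameNode that is still running, although the thread does not confirm it. A minimal sketch, with `check_checkpoint_dir` as a hypothetical helper name:

```shell
# Report whether a checkpoint directory looks usable for an import:
# it must exist, be accessible to the current user, contain
# current/VERSION, and have no in_use.lock left by another process.
check_checkpoint_dir() {
  local dir="$1" status=0
  if [ ! -d "$dir" ]; then
    echo "missing: $dir"; return 1
  fi
  if [ ! -r "$dir" ] || [ ! -x "$dir" ]; then
    echo "not accessible to $(id -un): $dir"; return 1
  fi
  if [ ! -f "$dir/current/VERSION" ]; then
    echo "no current/VERSION under $dir"; status=1
  fi
  if [ -e "$dir/in_use.lock" ]; then
    echo "in_use.lock present: another process may hold $dir"; status=1
  fi
  [ "$status" -eq 0 ] && echo "ok: $dir"
  return "$status"
}
```

Running `check_checkpoint_dir /mnt/disk1/hadoop/hdfs/namesecondary` as the user that starts the NameNode shows whether the "does not exist or is not accessible" error could come from permissions or from a lock held by the Secondary NameNode.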
