NameNode cannot be started

Super Collaborator

In the HDP sandbox, I get the following error in the NameNode log, and the NameNode cannot be started:

2015-12-21 12:51:27,924 INFO  common.Storage (Storage.java:tryLock(715)) - Lock on /hadoop/hdfs/namenode/in_use.lock acquired by nodename 2311@sandbox.hortonworks.com
2015-12-21 12:51:28,120 INFO  namenode.FileJournalManager (FileJournalManager.java:recoverUnfinalizedSegments(362)) - Recovering unfinalized segments in /hadoop/hdfs/namenode/current
2015-12-21 12:51:28,175 ERROR namenode.FSImage (FSImage.java:loadFSImage(679)) - Failed to load image from FSImageFile(file=/hadoop/hdfs/namenode/current/fsimage_0000000000000006843, cpktTxId=0000000000000006843)
java.io.IOException: Premature EOF from inputStream
        at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:201)
        at org.apache.hadoop.hdfs.server.namenode.FSImageFormat$LoaderDelegator.load(FSImageFormat.java:221)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:957)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:941)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImageFile(FSImage.java:740)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:676)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:294)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:976)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:681)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:607)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:667)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:896)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:880)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1586)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1652)
2015-12-21 12:51:28,223 WARN  namenode.FSNamesystem (FSNamesystem.java:loadFromDisk(683)) - Encountered exception loading fsimage
4 Replies

Master Mentor
@jzhang

See this line from your log:

ERROR namenode.FSImage (FSImage.java:loadFSImage(679)) - Failed to load image from FSImageFile(file=/hadoop/hdfs/namenode/current/fsimage_0000000000000006843, cpktTxId=0000000000000006843)

It looks like an abnormal crash caused the file to be lost or left it incompletely written.

If there is no data you need in the sandbox, you don't see the above file, and there is a backup, then you can try to format the NameNode or import a new sandbox image.

hadoop namenode -format

**This can cause data loss.**
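Since formatting is destructive, it may be worth archiving the metadata directory first. The following is only a minimal sketch: `backup_namenode_dir` is a hypothetical helper name, the destination directory is an assumption, and the source path is the one shown in the log above. It should be run while the NameNode is stopped.

```shell
#!/bin/sh
# Hypothetical helper: archive a NameNode metadata directory before a
# destructive step such as a format. Prints the archive path on success.
backup_namenode_dir() {
    src=$1    # e.g. /hadoop/hdfs/namenode (path taken from the log)
    dest=$2   # any directory with enough free space (assumed)
    [ -d "$src" ] || { echo "no such directory: $src" >&2; return 1; }
    mkdir -p "$dest" || return 1
    archive="$dest/namenode-$(date +%Y%m%d-%H%M%S).tar.gz"
    # -C keeps the paths inside the archive relative to the parent of $src
    tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")" || return 1
    echo "$archive"
}
```

For example, `backup_namenode_dir /hadoop/hdfs/namenode /root/nn-backup` would leave a timestamped `.tar.gz` under `/root/nn-backup` that can be unpacked again if the format turns out to be a mistake.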

Expert Contributor

Hi @jzhang, can you describe what you were doing with the Sandbox that led to this behavior?

New Contributor

Check the size of the file. If it is zero bytes, that particular FSImage file is corrupted. Instead of formatting the NameNode, which results in complete data loss, you can manually remove just that FSImage file and then start the NameNode, with minimal damage to your cluster.
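The size check can be sketched in shell as below. The function names are hypothetical, and the sketch moves the file aside rather than deleting it so that the step can be undone. It assumes the NameNode is stopped and that an older fsimage plus the later edit logs are still present in the same directory.

```shell
#!/bin/sh
# Print every zero-byte fsimage_* file in a directory, ignoring the
# .md5 checksum files that sit next to each image.
find_empty_fsimages() {
    dir=$1
    find "$dir" -type f -name 'fsimage_*' ! -name '*.md5' -size 0
}

# Move one fsimage file, and its .md5 file if present, into a
# separate directory instead of deleting it outright.
set_aside_fsimage() {
    image=$1
    aside=$2
    mkdir -p "$aside" || return 1
    mv "$image" "$aside/" || return 1
    if [ -f "$image.md5" ]; then
        mv "$image.md5" "$aside/" || return 1
    fi
}
```

On the sandbox, `find_empty_fsimages /hadoop/hdfs/namenode/current` would list any empty image files. Note that a file can also be truncated without being empty, which this check does not detect.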

Master Guru
@jzhang

Please check whether your Secondary NameNode has the latest checkpoint. If it does, try restoring the fsimage from that checkpoint.

If you have the latest checkpoint from the Secondary NameNode:

1. Create dfs.name.dir on your new NameNode (it must be empty)

2. Make sure fs.checkpoint.dir is pointed at your last known good copy

3. Start the NameNode with the -importCheckpoint option

4. The NameNode will verify that the files in fs.checkpoint.dir are consistent and create a new copy of the FsImage and EditLog in dfs.name.dir.

Your NameNode should start functioning again and will exit safemode once the appropriate number of blocks have been reported. The NameNode will not alter the files in fs.checkpoint.dir.
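The preconditions in steps 1 and 2 can be checked before attempting the import. This is a rough sketch, not a definitive procedure: `ready_for_import_checkpoint` is a hypothetical helper, and both directory paths have to be supplied by you. In Hadoop 2.x the two properties are named `dfs.namenode.name.dir` and `dfs.namenode.checkpoint.dir`; the older names used in the steps above are deprecated aliases.

```shell
#!/bin/sh
# Returns 0 when the directory layout matches steps 1 and 2 above.
ready_for_import_checkpoint() {
    name_dir=$1         # value of dfs.name.dir
    checkpoint_dir=$2   # value of fs.checkpoint.dir
    # Step 1: the NameNode directory must exist and be empty.
    [ -d "$name_dir" ] || { echo "missing name dir: $name_dir" >&2; return 1; }
    if [ -n "$(ls -A "$name_dir")" ]; then
        echo "name dir is not empty: $name_dir" >&2
        return 1
    fi
    # Step 2: the checkpoint directory should hold a non-empty fsimage.
    images=$(find "$checkpoint_dir" -type f -name 'fsimage_*' \
                 ! -name '*.md5' ! -size 0 2>/dev/null)
    if [ -z "$images" ]; then
        echo "no usable fsimage under: $checkpoint_dir" >&2
        return 1
    fi
    return 0
}

# Step 3, run as the HDFS service user once the check passes:
#   hdfs namenode -importCheckpoint
```

The check only looks at file presence and size. The consistency verification described in step 4 is still done by the NameNode itself during the import.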