10-29-2015
10:17 AM
Thanks, that explains why the patch was not applied! Any explanation (or a link where I can find the info) on what can cause a file to be under construction?
10-29-2015
08:12 AM
We had to patch the jar manually to get the namenode running again. Then we were able to remove the problematic file. Here is the chain of events:

- The secondary namenode tried to do a checkpoint but failed due to inodes under construction:

ERROR org.apache.hadoop.hdfs.server.namenode.FSImage: Unable to save image for /data/1/dfs/nn
java.lang.IllegalStateException
at com.google.common.base.Preconditions.checkState(Preconditions.java:129)
at org.apache.hadoop.hdfs.server.namenode.LeaseManager.getINodesUnderConstruction(LeaseManager.java:447)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFilesUnderConstruction(FSNamesystem.java:7235)
at org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Saver.serializeFilesUCSection(FSImageFormatPBINode.java:508)
at org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Saver.saveInodes(FSImageFormatProtobuf.java:431)
at org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Saver.saveInternal(FSImageFormatProtobuf.java:474)
at org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Saver.save(FSImageFormatProtobuf.java:410)
at org.apache.hadoop.hdfs.server.namenode.FSImage.saveFSImage(FSImage.java:958)
at org.apache.hadoop.hdfs.server.namenode.FSImage$FSImageSaver.run(FSImage.java:1009)
at java.lang.Thread.run(Thread.java:745)

- Cloudera Manager did warn us, with an email that we thought was about a system problem (disk related).
- A bit after that we did a failover; then both namenodes refused to start.
- After looking around we found that it could somehow be related to HDFS-8384.
- Since we thought that the HDFS-8384 patch was supposed to be applied to CDH 5.4.7 according to the release notes, we looked elsewhere for the cause of the problem.
- We decided to take a look at the source code of hadoop-hdfs-2.6.0-cdh5.4.7.jar and realized that the patch was not applied.
- We manually compiled the patch (just the method that was causing the problem), repackaged the jar, and were able to restart the namenode, discover the faulty file, and get back on our feet.

Shall I open a JIRA to mention that HDFS-8384 is not applied to CDH 5.4.7? What can cause an INode to be under construction? Thanks
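For what it's worth, our understanding is that a file is "under construction" from the moment a client opens it for write (create or append) until the client closes it or the namenode recovers its lease, so a writer that crashes or hangs before close() leaves the inode in that state. A minimal toy model of that lifecycle (none of these classes are HDFS code; ToyNamenode is made up for illustration):

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the namenode's view of a leased file: under construction
// from create() until close(), or until lease recovery closes it on
// behalf of a dead client. This is a sketch, not HDFS internals.
public class UnderConstructionModel {
    enum State { UNDER_CONSTRUCTION, COMPLETE }

    static class ToyNamenode {
        final Map<String, State> files = new HashMap<>();

        void create(String path)       { files.put(path, State.UNDER_CONSTRUCTION); }
        void close(String path)        { files.put(path, State.COMPLETE); }
        // Lease recovery completes the file when the writer is gone.
        void recoverLease(String path) { files.put(path, State.COMPLETE); }

        boolean isUnderConstruction(String path) {
            return files.get(path) == State.UNDER_CONSTRUCTION;
        }
    }

    public static void main(String[] args) {
        ToyNamenode nn = new ToyNamenode();

        nn.create("/tmp/a");                 // writer opens the file
        assert nn.isUnderConstruction("/tmp/a");
        nn.close("/tmp/a");                  // normal path: close completes it
        assert !nn.isUnderConstruction("/tmp/a");

        nn.create("/tmp/crashed");           // writer dies before close()
        assert nn.isUnderConstruction("/tmp/crashed"); // stays UC until recovery
        nn.recoverLease("/tmp/crashed");
        assert !nn.isUnderConstruction("/tmp/crashed");

        System.out.println("ok");
    }
}
```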
10-27-2015
10:52 PM
I've looked at the code provided in hadoop-hdfs-2.6.0-cdh5.4.7.jar:

synchronized long getNumUnderConstructionBlocks() {
  assert this.fsnamesystem.hasReadLock() : "The FSNamesystem read lock wasn't"
      + "acquired before counting under construction blocks";
  long numUCBlocks = 0;
  for (Lease lease : sortedLeases) {
    for (String path : lease.getPaths()) {
      final INodeFile cons;
      try {
        cons = this.fsnamesystem.getFSDirectory().getINode(path).asFile();
        Preconditions.checkState(cons.isUnderConstruction());
      } catch (UnresolvedLinkException e) {
        throw new AssertionError("Lease files should reside on this FS");
      }
      BlockInfo[] blocks = cons.getBlocks();
      if (blocks == null)
        continue;
      for (BlockInfo b : blocks) {
        if (!b.isComplete())
          numUCBlocks++;
      }
    }
  }
  LOG.info("Number of blocks under construction: " + numUCBlocks);
  return numUCBlocks;
}

It looks like the patch from HDFS-8384 was not applied to CDH 5.4.7. The commit of the patch is here: https://github.com/apache/hadoop/commit/8928729c80af0a154524e06fb13ed9b191986a78
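If we read the linked commit correctly, the essence of the HDFS-8384 fix is to replace that hard Preconditions.checkState with a warn-and-skip, so a single inode that holds a lease without being under construction no longer aborts the whole scan (and with it, namenode startup). A standalone sketch of the two behaviors (the lease table here is a plain map, not the real LeaseManager):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch contrasting the shipped behavior (checkState throws
// IllegalStateException and kills the scan) with the HDFS-8384
// behavior (log a warning and skip the inconsistent entry).
// Not HDFS code; class and method names are made up.
public class LeaseScanSketch {
    // path -> isUnderConstruction (false models the corrupt entry)
    static Map<String, Boolean> leases() {
        Map<String, Boolean> m = new LinkedHashMap<>();
        m.put("/ok/file1", true);
        m.put("/bad/stale", false); // leased but NOT under construction
        m.put("/ok/file2", true);
        return m;
    }

    // Pre-patch style: one inconsistent inode aborts the whole count,
    // which is what prevented the namenode from starting.
    static long countStrict(Map<String, Boolean> leases) {
        long n = 0;
        for (Map.Entry<String, Boolean> e : leases.entrySet()) {
            if (!e.getValue()) {
                throw new IllegalStateException(); // Preconditions.checkState(...)
            }
            n++;
        }
        return n;
    }

    // HDFS-8384 style: warn and continue past the inconsistent inode.
    static long countLenient(Map<String, Boolean> leases) {
        long n = 0;
        for (Map.Entry<String, Boolean> e : leases.entrySet()) {
            if (!e.getValue()) {
                System.err.println("WARN: " + e.getKey()
                        + " has a lease but is not under construction; skipping");
                continue;
            }
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        boolean threw = false;
        try {
            countStrict(leases());
        } catch (IllegalStateException ex) {
            threw = true; // mirrors the startup failure in the stack trace
        }
        System.out.println("strict threw: " + threw);            // true
        System.out.println("lenient count: " + countLenient(leases())); // 2
    }
}
```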
10-27-2015
10:37 PM
We are running a CDH 5.4.7 cluster, and after an automatic failover both namenodes now refuse to start. Output:

Failed to start namenode.
java.lang.IllegalStateException
at com.google.common.base.Preconditions.checkState(Preconditions.java:129)
at org.apache.hadoop.hdfs.server.namenode.LeaseManager.getNumUnderConstructionBlocks(LeaseManager.java:119)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getCompleteBlocksTotal(FSNamesystem.java:6339)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startCommonServices(FSNamesystem.java:1149)
at org.apache.hadoop.hdfs.server.namenode.NameNode.startCommonServices(NameNode.java:677)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:663)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:810)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:794)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1487)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1553)
2015-10-28 01:07:56,579 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1

It looks similar to https://issues.apache.org/jira/browse/HDFS-8384, but that is supposed to be fixed in 5.3.8: http://www.cloudera.com/content/www/en-us/documentation/enterprise/latest/topics/cdh_rn_fixed_in_538.html

We are also not able to run hadoop namenode -recover; it fails with the same stack trace:

15/10/28 01:33:39 INFO namenode.FSImage: Save namespace
15/10/28 01:33:43 ERROR namenode.FSImage: Unable to save image for /data/1/dfs/nn
java.lang.IllegalStateException
at com.google.common.base.Preconditions.checkState(Preconditions.java:129)
at org.apache.hadoop.hdfs.server.namenode.LeaseManager.getINodesUnderConstruction(LeaseManager.java:447)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFilesUnderConstruction(FSNamesystem.java:7264)
at org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Saver.serializeFilesUCSection(FSImageFormatPBINode.java:508)
at org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Saver.saveInodes(FSImageFormatProtobuf.java:431)
at org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Saver.saveInternal(FSImageFormatProtobuf.java:474)
at org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Saver.save(FSImageFormatProtobuf.java:410)
at org.apache.hadoop.hdfs.server.namenode.FSImage.saveFSImage(FSImage.java:958)
at org.apache.hadoop.hdfs.server.namenode.FSImage$FSImageSaver.run(FSImage.java:1009)
at java.lang.Thread.run(Thread.java:745)

Is there any workaround?
Labels: HDFS