Created 06-26-2018 03:53 PM
Hi All,
We were experiencing an issue where 4 of our data nodes were not sending block reports to the name node. To resolve it, we formatted the data node directories on all data nodes, then decommissioned and recommissioned the data nodes. We also deleted all the data from HDFS. Now, when I try to format the name node, I get the error below.
18/06/26 16:32:05 WARN namenode.NameNode: Encountered exception during format:
org.apache.hadoop.hdfs.qjournal.client.QuorumException: Unable to check if JNs are ready for formatting. 1 exceptions thrown:
10.217.99.13:8485: Cannot lock storage /hadoop/hdfs/journal/HDPDRHA. The directory is already locked
    at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.lock(Storage.java:743)
    at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:551)
    at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:502)
    at org.apache.hadoop.hdfs.qjournal.server.JNStorage.analyzeAndRecoverStorage(JNStorage.java:227)
    at org.apache.hadoop.hdfs.qjournal.server.JNStorage.<init>(JNStorage.java:76)
    at org.apache.hadoop.hdfs.qjournal.server.Journal.<init>(Journal.java:143)
    at org.apache.hadoop.hdfs.qjournal.server.JournalNode.getOrCreateJournal(JournalNode.java:90)
    at org.apache.hadoop.hdfs.qjournal.server.JournalNode.getOrCreateJournal(JournalNode.java:99)
    at org.apache.hadoop.hdfs.qjournal.server.JournalNodeRpcServer.isFormatted(JournalNodeRpcServer.java:120)
    at org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolServerSideTranslatorPB.isFormatted(QJournalProtocolServerSideTranslatorPB.java:103)
    at org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos$QJournalProtocolService$2.callBlockingMethod(QJournalProtocolProtos.java:25399)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2345)
    at org.apache.hadoop.hdfs.qjournal.client.QuorumException.create(QuorumException.java:81)
    at org.apache.hadoop.hdfs.qjournal.client.QuorumCall.rethrowException(QuorumCall.java:223)
    at org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.hasSomeData(QuorumJournalManager.java:232)
    at org.apache.hadoop.hdfs.server.common.Storage.confirmFormat(Storage.java:965)
    at org.apache.hadoop.hdfs.server.namenode.FSImage.confirmFormat(FSImage.java:179)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:1185)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1631)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1769)
18/06/26 16:32:05 ERROR namenode.NameNode: Failed to start namenode.
org.apache.hadoop.hdfs.qjournal.client.QuorumException: Unable to check if JNs are ready for formatting. 1 exceptions thrown:
10.217.99.13:8485: Cannot lock storage /hadoop/hdfs/journal/HDPDRHA. The directory is already locked
    at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.lock(Storage.java:743)
    at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:551)
    at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:502)
    at org.apache.hadoop.hdfs.qjournal.server.JNStorage.analyzeAndRecoverStorage(JNStorage.java:227)
    at org.apache.hadoop.hdfs.qjournal.server.JNStorage.<init>(JNStorage.java:76)
    at org.apache.hadoop.hdfs.qjournal.server.Journal.<init>(Journal.java:143)
    at org.apache.hadoop.hdfs.qjournal.server.JournalNode.getOrCreateJournal(JournalNode.java:90)
    at org.apache.hadoop.hdfs.qjournal.server.JournalNode.getOrCreateJournal(JournalNode.java:99)
    at org.apache.hadoop.hdfs.qjournal.server.JournalNodeRpcServer.isFormatted(JournalNodeRpcServer.java:120)
    at org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolServerSideTranslatorPB.isFormatted(QJournalProtocolServerSideTranslatorPB.java:103)
    at org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos$QJournalProtocolService$2.callBlockingMethod(QJournalProtocolProtos.java:25399)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2345)
    at org.apache.hadoop.hdfs.qjournal.client.QuorumException.create(QuorumException.java:81)
    at org.apache.hadoop.hdfs.qjournal.client.QuorumCall.rethrowException(QuorumCall.java:223)
    at org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.hasSomeData(QuorumJournalManager.java:232)
    at org.apache.hadoop.hdfs.server.common.Storage.confirmFormat(Storage.java:965)
    at org.apache.hadoop.hdfs.server.namenode.FSImage.confirmFormat(FSImage.java:179)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:1185)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1631)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1769)
18/06/26 16:32:05 INFO util.ExitUtil: Exiting with status 1
18/06/26 16:32:05 INFO namenode.NameNode: SHUTDOWN_MSG:
I stopped both name nodes, so only the journal nodes were online when I executed the command below.
hadoop namenode -format
Any idea why I am getting the above error? Thank you so much for your assistance on this.
Created 06-26-2018 04:20 PM
Please shut down the journal nodes, as there seems to be an in_use.lock file left behind. Then restart them and retry the format.
Created 06-26-2018 07:08 PM
Thank you so much @Geoffrey Shelton Okot for your assistance as always.
I have restarted the journal nodes, but it did not help. Would it help to delete all the journal nodes and re-add them, or to completely remove the HDFS service and install the name node and data nodes from scratch?
Please advise if you know of a better way to solve this issue.
Created 06-26-2018 08:19 PM
Created 06-26-2018 08:26 PM
@Geoffrey Shelton Okot, I have 3 journal nodes and 3 ZooKeeper nodes. I removed the lock files and ran hdfs namenode -format. I noticed that the in_use.lock file is created again when the namenode -format command runs.
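For anyone checking the same thing: the stack trace earlier in the thread shows the lock being taken on the journal node side (JournalNodeRpcServer.isFormatted, then getOrCreateJournal, then StorageDirectory.lock), so a lock file reappearing during a format attempt seems consistent with the journal node opening its storage rather than with a leftover file. Below is a minimal sketch for listing the lock files on a journal node host. The function name find_journal_locks is hypothetical, and the directory in the usage note is only assumed from the path in the error message; use whatever dfs.journalnode.edits.dir points to on your cluster.

```shell
# List every in_use.lock file found under a journal node edits directory.
# Assumption: the layout is <edits dir>/<journal id>/in_use.lock, as suggested
# by the path /hadoop/hdfs/journal/HDPDRHA in the error message.
find_journal_locks() {
  journal_root="$1"
  # -maxdepth 2 covers both the edits dir itself and a single journal id dir.
  find "$journal_root" -maxdepth 2 -type f -name in_use.lock 2>/dev/null
}
```

Run it on each journal node host, for example find_journal_locks /hadoop/hdfs/journal, and compare the results before and after a format attempt.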
Created 06-26-2018 08:53 PM
Was the format successful?
Created 06-27-2018 04:59 PM
@Geoffrey Shelton Okot, today I managed to format the name node successfully. The metadata on one of the journal nodes was not in sync with the other two journal nodes. That is why the problematic journal node was getting locked every time I tried to format the name node. Copying the data from a good journal node's directory to the problematic journal node's directory allowed me to format the name node. I also deleted the in_use.lock files from all 3 journal nodes before executing the hdfs namenode -format command.
Thank you so much for your assistance on this.
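A rough sketch of the copy step described above, for readers who hit the same error. It assumes the journal node on the problematic host is stopped, that a copy of the healthy node's journal directory is already available locally (for example, brought over with scp), and that the mismatch shows up in current/VERSION, which is only a guess at what "not in sync" looked like here. The function names are hypothetical.

```shell
# Return 0 when both journal directories carry an identical current/VERSION file.
journal_version_matches() {
  cmp -s "$1/current/VERSION" "$2/current/VERSION"
}

# Replace a stale journal directory with a copy of a healthy one.
# The stale directory is kept as a timestamped backup, and any in_use.lock
# that came along with the copy is removed.
resync_journal_dir() {
  good_dir="$1"
  stale_dir="$2"
  backup_dir="${stale_dir}.bak.$(date +%Y%m%d%H%M%S)"
  mv "$stale_dir" "$backup_dir" || return 1
  # -R copies recursively, -p preserves mode, timestamps and (where permitted) ownership.
  cp -Rp "$good_dir" "$stale_dir" || return 1
  rm -f "$stale_dir/in_use.lock"
}
```

Something like journal_version_matches /tmp/good/HDPDRHA /hadoop/hdfs/journal/HDPDRHA || resync_journal_dir /tmp/good/HDPDRHA /hadoop/hdfs/journal/HDPDRHA would be the idea, where /tmp/good/HDPDRHA is a placeholder path. Check afterwards that the copied files are owned by the user the journal node runs as before starting it again.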
Created 06-27-2018 05:26 PM
Good to know. Give yourself the points, then close this thread.