journal node edit log issue

The JournalNode is logging the WARN below, and Ambari is alerting that the JournalNode web UI is not accessible. Any idea how to recover from this?

2016-10-19 12:36:20,353 WARN  namenode.FSImage (EditLogFileInputStream.java:scanEditLog(359)) - Caught exception after scanning through 0 ops from /hadoop/hdfs/journal/stanleyhotel/current/edits_inprogress_0000000000064985103 while determining its valid length. Position was 888832
java.io.IOException: Can't scan a pre-transactional edit log.
	at org.apache.hadoop.hdfs.server.namenode.FSEditLogOp$LegacyReader.scanOp(FSEditLogOp.java:4959)
	at org.apache.hadoop.hdfs.server.namenode.EditLogFileInputStream.scanNextOp(EditLogFileInputStream.java:245)
	at org.apache.hadoop.hdfs.server.namenode.EditLogFileInputStream.scanEditLog(EditLogFileInputStream.java:355)
	at org.apache.hadoop.hdfs.server.namenode.FileJournalManager$EditLogFile.scanLog(FileJournalManager.java:551)
	at org.apache.hadoop.hdfs.qjournal.server.Journal.scanStorageForLatestEdits(Journal.java:192)
	at org.apache.hadoop.hdfs.qjournal.server.Journal.<init>(Journal.java:152)
	at org.apache.hadoop.hdfs.qjournal.server.JournalNode.getOrCreateJournal(JournalNode.java:90)
	at org.apache.hadoop.hdfs.qjournal.server.JournalNode.getOrCreateJournal(JournalNode.java:99)
	at org.apache.hadoop.hdfs.qjournal.server.JournalNodeRpcServer.heartbeat(JournalNodeRpcServer.java:158)
	at org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolServerSideTranslatorPB.heartbeat(QJournalProtocolServerSideTranslatorPB.java:172)
	at org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos$QJournalProtocolService$2.callBlockingMethod(QJournalProtocolProtos.java:25423)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2313)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2309)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2307)
2016-10-19 12:36:20,353 WARN  namenode.FSImage (EditLogFileInputStream.java:scanEditLog(364)) - After resync, position is 888832

Accepted Solution

Re: journal node edit log issue

Hi @Santhosh B Gowda,

Assuming that this is happening on a single JournalNode then you can try the following:

  1. As a precaution, stop HDFS. This will shut down all JournalNodes as well.
  2. On the node in question, move the edits directory (/hadoop/hdfs/journal/stanleyhotel/current) to an alternate location.
  3. Copy the edits directory (/hadoop/hdfs/journal/stanleyhotel/current) from a functioning JournalNode to this node.
  4. Start HDFS.

This should bring this JournalNode back in line with the others and return you to a properly functioning HA state.
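
Steps 2 and 3 could be scripted roughly as follows. This is only a sketch under some assumptions: HDFS and every JournalNode process are already stopped, and the "current" directory from a functioning JournalNode has already been transferred to a local staging path on the affected node (for example with scp). The function name and the parameters healthy_copy and backup_root are hypothetical names chosen for illustration, not part of Hadoop or Ambari.

```python
import shutil
import time
from pathlib import Path


def replace_journal_dir(current_dir, healthy_copy, backup_root):
    """Move the damaged 'current' directory aside, then install the healthy copy.

    Returns the backup path so the old data can be inspected or restored later.
    """
    current_dir = Path(current_dir)
    healthy_copy = Path(healthy_copy)
    backup_root = Path(backup_root)

    # Refuse to touch anything unless both directories are really there.
    if not current_dir.is_dir():
        raise FileNotFoundError(f"journal directory not found: {current_dir}")
    if not healthy_copy.is_dir():
        raise FileNotFoundError(f"healthy copy not found: {healthy_copy}")

    backup_root.mkdir(parents=True, exist_ok=True)
    backup = backup_root / f"current.bak.{int(time.time())}"
    if backup.exists():
        raise FileExistsError(f"backup path already exists: {backup}")

    # Step 2: move the damaged directory to an alternate location.
    shutil.move(str(current_dir), str(backup))
    # Step 3: put the copy taken from a functioning JournalNode in its place.
    shutil.copytree(healthy_copy, current_dir)
    return backup


# Example call (staging and backup paths are hypothetical):
# replace_journal_dir(
#     "/hadoop/hdfs/journal/stanleyhotel/current",
#     "/tmp/healthy_jn/current",
#     "/tmp/jn_backup",
# )
```

Depending on how the copy was made, file ownership may need to be restored to whichever user runs the JournalNode on your cluster before HDFS is started again.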

Re: journal node edit log issue

@Brandon Wilson Thanks, it resolved the problem.

Re: journal node edit log issue

Hi @Brandon Wilson

Your solution works perfectly, but only if the "edits_inprogress_" file has the same name on both JournalNodes (JNs).

On my devcluster, I had not dealt with the problem for two months. During that time, the healthy JN created a new "edits_inprogress_" file, but the sick JN still asks for the old "edits_inprogress_" file. I followed all 4 steps of your procedure, but the sick JN again asks for the old file. The content of /hadoop/hdfs/journal/devcluster/current is the same on both nodes.

What should I do?

Log of healthy JN (edits_inprogress_0000000000016172345)

2017-02-02 10:15:12,513 INFO  namenode.FileJournalManager (FileJournalManager.java:finalizeLogSegment(133)) - Finalizing edits file /hadoop/hdfs/journal/devcluster/current/edits_inprogress_0000000000016172345 -> /hadoop/hdfs/journal/devcluster/current/edits_0000000000016172345-0000000000016172394

Log of sick JN (edits_inprogress_0000000000011766543)

2017-02-02 10:15:57,744 WARN  namenode.FSImage (EditLogFileInputStream.java:scanEditLog(350)) - Caught exception after scanning through 0 ops from /hadoop/hdfs/journal/devcluster/current/edits_inprogress_0000000000011766543 while determining its valid length. Position was 1036288
java.io.IOException: Can't scan a pre-transactional edit log.

Re: journal node edit log issue

Solved it! The sick JN did not stop when I stopped it in Ambari, or even when I stopped HDFS in Ambari. I killed the JN process manually, replaced the data with the copy from the healthy JN, and started HDFS. Now it works! :)
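
For anyone hitting the same thing, a quick way to confirm that no JournalNode JVM survived the stop is to look for its main class in the process list before replacing any data. The sketch below assumes a Linux host where `ps -eo pid,args` is available; the class name is the one that appears in the stack trace above, and the helper names are hypothetical.

```python
import subprocess

# Main class of the JournalNode daemon, as seen in the stack trace.
JOURNALNODE_CLASS = "org.apache.hadoop.hdfs.qjournal.server.JournalNode"


def find_journalnode_pids(ps_lines):
    """Return PIDs from 'pid args' lines whose command line names the JN class."""
    pids = []
    for line in ps_lines:
        parts = line.strip().split(None, 1)
        # Skip the header row and anything without a numeric PID.
        if len(parts) != 2 or not parts[0].isdigit():
            continue
        # Match the class as a whole argument, not as a substring.
        if JOURNALNODE_CLASS in parts[1].split():
            pids.append(int(parts[0]))
    return pids


def running_journalnode_pids():
    """Ask ps for all processes and return the PIDs of running JournalNodes."""
    result = subprocess.run(
        ["ps", "-eo", "pid,args"],
        capture_output=True, text=True, check=True,
    )
    return find_journalnode_pids(result.stdout.splitlines())
```

If `running_journalnode_pids()` still returns a PID after HDFS has been stopped in Ambari, that process has to be stopped manually before the journal directory is swapped, which matches what happened here.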

Re: journal node edit log issue

Thanks @Brandon Wilson, it worked for me too.