
SYMPTOM: Ambari shows a CRITICAL alert reporting that the connection to the JournalNode service failed. The alert is shown below:

2016-06-30 18:50:39,865 [CRITICAL] [HDFS] [journalnode_process] (JournalNode Process) Connection failed to http://jn1.example.com:8480 (Execution of 'curl -k --negotiate -u : -b /var/lib/ambari-agent/tmp/cookies/f8ed47d4-f63e-482c-be70-36755387ca4b -c /var/lib/ambari-agent/tmp/cookies/f8ed47d4-f63e-482c-be70-36755387ca4b -w '%{http_code}' http://jn.example.com:8480 --connect-timeout 5 --max-time 7 -o /dev/null 1>/tmp/tmpE9v3mg 2>/tmp/tmpKOSncN' returned 28. % Total % Received % Xferd Average Speed Time Time Time Current 
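
The "returned 28" in the alert is curl's exit status, and 28 is curl's code for an operation timeout, so the JournalNode web endpoint did not answer within the limits used by the check (--connect-timeout 5 --max-time 7). To repeat the check by hand, a sketch along the following lines may help. It reuses the curl options from the alert but omits the cookie files. The function names check_jn_http and describe_curl_exit are hypothetical and exist only for this example.

```shell
# Hypothetical helpers for repeating Ambari's JournalNode web check by hand.

# Translate the curl exit statuses most relevant to this alert.
describe_curl_exit() {
  case "$1" in
    0)  echo "ok" ;;
    6)  echo "could not resolve host" ;;
    7)  echo "failed to connect" ;;
    28) echo "operation timed out" ;;
    *)  echo "curl exit status $1 (see EXIT CODES in the curl man page)" ;;
  esac
}

# Send the same kind of request as the alert and report the outcome.
# $1 is the JournalNode web URL, for example http://jn1.example.com:8480
check_jn_http() {
  rc=0
  code=$(curl -k --negotiate -u : -s -o /dev/null -w '%{http_code}' \
              --connect-timeout 5 --max-time 7 "$1") || rc=$?
  echo "http_code=${code} curl_exit=${rc} ($(describe_curl_exit "$rc"))"
}
```

Calling check_jn_http with the URL from the alert on the affected host, and again against a healthy JournalNode, shows whether the timeout is specific to one node.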

ERROR: The JournalNode log shows the following:

2016-07-01 10:21:29,390 WARN namenode.FSImage (EditLogFileInputStream.java:scanEditLog(350)) - Caught exception after scanning through 0 ops from /hadoop/hdfs/journal/phadcluster01/current/edits_inprogress_0000000002510372012 while determining its valid length. Position was 712704
java.io.IOException: Can't scan a pre-transactional edit log.
at org.apache.hadoop.hdfs.server.namenode.FSEditLogOp$LegacyReader.scanOp(FSEditLogOp.java:4959) 
at org.apache.hadoop.hdfs.server.namenode.EditLogFileInputStream.scanNextOp(EditLogFileInputStream.java:245) 
at org.apache.hadoop.hdfs.server.namenode.EditLogFileInputStream.scanEditLog(EditLogFileInputStream.java:346) 
at org.apache.hadoop.hdfs.server.namenode.FileJournalManager$EditLogFile.scanLog(FileJournalManager.java:520) 
at org.apache.hadoop.hdfs.qjournal.server.Journal.scanStorageForLatestEdits(Journal.java:192) 
at org.apache.hadoop.hdfs.qjournal.server.Journal.<init>(Journal.java:152) 
at org.apache.hadoop.hdfs.qjournal.server.JournalNode.getOrCreateJournal(JournalNode.java:90) 
at org.apache.hadoop.hdfs.qjournal.server.JournalNode.getOrCreateJournal(JournalNode.java:99) 
at org.apache.hadoop.hdfs.qjournal.server.JournalNodeRpcServer.startLogSegment(JournalNodeRpcServer.java:161) 
at org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolServerSideTranslatorPB.startLogSegment(QJournalProtocolServerSideTranslatorPB.java:186) 
at org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos$QJournalProtocolService$2.callBlockingMethod(QJournalProtocolProtos.java:25425) 
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616) 
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969) 
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2151) 
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2147) 
at java.security.AccessController.doPrivileged(Native Method) 
at javax.security.auth.Subject.doAs(Subject.java:415) 
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) 
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2145) 

ROOT CAUSE: The log entry below suggests that the in-progress edits file on this JournalNode was corrupted:

2016-07-01 10:21:16,007 WARN namenode.FSImage (EditLogFileInputStream.java:scanEditLog(350)) - Caught exception after scanning through 0 ops from /hadoop/hdfs/journal/phadcluster01/current/edits_inprogress_0000000002510372012 while determining its valid length. Position was 712704
java.io.IOException: Can't scan a pre-transactional edit log.
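
To see which edits files the JournalNode complains about, the log can be filtered for this warning. The sketch below is one possible way to do that. The function name find_corrupt_edits is hypothetical, and the log file location is an assumption that depends on the installation. A match only shows that the file could not be scanned and does not by itself prove corruption.

```shell
# Hypothetical helper: list the edits_inprogress files that the
# JournalNode log reports as unreadable.
# $1 is the JournalNode log file (its location depends on the installation).
find_corrupt_edits() {
  grep 'Caught exception after scanning' "$1" \
    | grep -o '/[^ ]*edits_inprogress_[0-9]*' \
    | sort -u
}
```

Running the same filter on the other JournalNodes' logs helps to pick a node whose in-progress file is not reported.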

RESOLUTION: The following steps resolved the issue:

1. Stop the JournalNode.
2. Back up the existing JournalNode directory metadata.
3. Copy a working edits_inprogress file from another JournalNode.
4. Change the ownership of the copied file to hdfs:hadoop.
5. Restart the JournalNode.
6. The JournalNode started successfully and no more errors appeared in the log.
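
The file-level steps above (2 to 4) can be sketched as a small script. This is a minimal illustration and not the exact commands used in this case: the healthy host name jn2.example.com, the backup location, and the helper names run and recover_journalnode are assumptions, and the journal directory is taken from the log above. With DRY_RUN=1 (the default) the commands are only printed, so they can be reviewed first.

```shell
# Sketch of the recovery steps. Every default below is an assumption;
# adjust before use. With DRY_RUN=1 (the default) commands are only printed.
JN_DIR="${JN_DIR:-/hadoop/hdfs/journal/phadcluster01}"  # journal directory seen in the log
GOOD_JN="${GOOD_JN:-jn2.example.com}"                    # hypothetical healthy JournalNode
BACKUP="${BACKUP:-${JN_DIR}.bak}"                        # hypothetical backup location
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

recover_journalnode() {
  # Step 1 (stop the JournalNode) is done beforehand, for example from Ambari.
  # Step 2: back up the existing journal directory.
  run cp -rp "$JN_DIR" "$BACKUP" || return 1
  # Step 3: copy the in-progress edits file from the healthy JournalNode.
  run scp -p "${GOOD_JN}:${JN_DIR}/current/edits_inprogress_*" \
             "${JN_DIR}/current/" || return 1
  # Step 4: give the copied files the expected owner and group.
  run chown -R hdfs:hadoop "${JN_DIR}/current" || return 1
  # Step 5 (restart the JournalNode) follows, again for example from Ambari.
}
```

Calling recover_journalnode as-is prints the three commands. Setting DRY_RUN=0 executes them, which should only be done on the affected node while its JournalNode is stopped.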
Version history
Revision #:
1 of 1
Last update:
‎12-23-2016 05:25 PM