
NameNode fails due to JournalNode de-sync/timeout

New Contributor

Hello,


We're currently running a fairly busy cluster on CDH 5.7.4, with our NN configured in HA using QJM and 3 JournalNodes. We've been experiencing periodic aborts of the active NN for a while, but they seem to have become much more frequent after upgrading from CDH 5.4.10 to 5.7.4, which may or may not be related.


The topology looks like this:

  • master1 hosts RM, JN, HBase Master
  • master2 hosts NN, JN
  • master3 hosts RM, JN, NN, HBase Master


Our RM and HBase Master also run in HA configuration.


The failure mode looks something like this: the NN attempts to commit transactions to the JournalNode quorum. One node (master1) usually commits very quickly, while the other two typically exhibit one of two failures: either a de-sync (JournalOutOfSyncException) or a timeout. The timeouts and de-syncs appear to be more common on master2 than on master3, but they're fairly frequent on both. When this happens the active NN exits and the hot spare NN takes over, so we maintain availability, but this is obviously less than ideal from an operational perspective. Checking the logs, we see lots of complaints about long JN sync times, yet our monitoring shows no disk saturation on the affected servers. The volumes in question are RAID1 volumes of two physical disks each.
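Since every failed attempt in the logs below dies right at the 20000 ms mark, one stopgap we're considering is raising the QJM write timeout in hdfs-site.xml. This is only a sketch (the 60000 ms value is an arbitrary example we haven't validated, and it would mask the slow syncs rather than fix them):

```xml
<!-- hdfs-site.xml (sketch): raise the QJM write timeout from the
     20000 ms default seen in the logs. 60000 ms is an example value. -->
<property>
  <name>dfs.qjournal.write-txns.timeout.ms</name>
  <value>60000</value>
</property>
```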


For reference, the relevant logs are pasted below, with IP addresses scrubbed.


NameNode on master2:

2016-10-28 16:25:18,397 INFO org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Waited 8003 ms (timeout=20000 ms) for a response for sendEdits. Succeeded so far: [master1:8485]. Exceptions so far: [master3:8485: Journal disabled until next roll]
2016-10-28 16:25:19,397 INFO org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Waited 9004 ms (timeout=20000 ms) for a response for sendEdits. Succeeded so far: [master1:8485]. Exceptions so far: [master3:8485: Journal disabled until next roll]
2016-10-28 16:25:20,398 INFO org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Waited 10005 ms (timeout=20000 ms) for a response for sendEdits. Succeeded so far: [master1:8485]. Exceptions so far: [master3:8485: Journal disabled until next roll]
2016-10-28 16:25:21,399 INFO org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Waited 11006 ms (timeout=20000 ms) for a response for sendEdits. Succeeded so far: [master1:8485]. Exceptions so far: [master3:8485: Journal disabled until next roll]
2016-10-28 16:25:22,400 INFO org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Waited 12007 ms (timeout=20000 ms) for a response for sendEdits. Succeeded so far: [master1:8485]. Exceptions so far: [master3:8485: Journal disabled until next roll]
2016-10-28 16:25:23,003 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Rescanning after 30001 milliseconds
2016-10-28 16:25:23,402 INFO org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Waited 13008 ms (timeout=20000 ms) for a response for sendEdits. Succeeded so far: [master1:8485]. Exceptions so far: [master3:8485: Journal disabled until next roll]
2016-10-28 16:25:24,403 WARN org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Waited 14010 ms (timeout=20000 ms) for a response for sendEdits. Succeeded so far: [master1:8485]. Exceptions so far: [master3:8485: Journal disabled until next roll]
2016-10-28 16:25:25,404 WARN org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Waited 15011 ms (timeout=20000 ms) for a response for sendEdits. Succeeded so far: [master1:8485]. Exceptions so far: [master3:8485: Journal disabled until next roll]
2016-10-28 16:25:26,405 WARN org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Waited 16012 ms (timeout=20000 ms) for a response for sendEdits. Succeeded so far: [master1:8485]. Exceptions so far: [master3:8485: Journal disabled until next roll]
2016-10-28 16:25:27,406 WARN org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Waited 17013 ms (timeout=20000 ms) for a response for sendEdits. Succeeded so far: [master1:8485]. Exceptions so far: [master3:8485: Journal disabled until next roll]
2016-10-28 16:25:28,408 WARN org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Waited 18014 ms (timeout=20000 ms) for a response for sendEdits. Succeeded so far: [master1:8485]. Exceptions so far: [master3:8485: Journal disabled until next roll]
2016-10-28 16:25:29,409 WARN org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Waited 19015 ms (timeout=20000 ms) for a response for sendEdits. Succeeded so far: [master1:8485]. Exceptions so far: [master3:8485: Journal disabled until next roll]
2016-10-28 16:25:30,394 FATAL org.apache.hadoop.hdfs.server.namenode.FSEditLog: Error: flush failed for required journal (JournalAndStream(mgr=QJM to [master1:8485, master2:8485, master3:8485], stream=QuorumOutputStream starting at txid 23834184641))
java.io.IOException: Timed out waiting 20000ms for a quorum of nodes to respond.


JournalNode on master2:

2016-10-28 16:25:33,430 WARN org.apache.hadoop.hdfs.qjournal.server.Journal: Sync of transaction range 23834653773-23834653915 took 23036ms
2016-10-28 16:25:34,078 INFO org.apache.hadoop.ipc.Server: Socket Reader #1 for port 8485: readAndProcess from client master2 threw exception [java.io.IOException: Connection reset by peer]
java.io.IOException: Connection reset by peer
        at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
        at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
        at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
        at sun.nio.ch.IOUtil.read(IOUtil.java:197)
        at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
        at org.apache.hadoop.ipc.Server.channelRead(Server.java:2641)
        at org.apache.hadoop.ipc.Server.access$2800(Server.java:134)
        at org.apache.hadoop.ipc.Server$Connection.readAndProcess(Server.java:1488)
        at org.apache.hadoop.ipc.Server$Listener.doRead(Server.java:774)
        at org.apache.hadoop.ipc.Server$Listener$Reader.doRunLoop(Server.java:647)
        at org.apache.hadoop.ipc.Server$Listener$Reader.run(Server.java:618)

JournalNode on master3:

2016-10-28 16:23:57,227 WARN org.apache.hadoop.hdfs.qjournal.server.Journal: Sync of transaction range 23834547943-23834547943 took 13986ms
2016-10-28 16:24:03,020 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 8485, call org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocol.journal from master2:52766 Call#95308256 Retry#0
org.apache.hadoop.hdfs.qjournal.protocol.JournalOutOfSyncException: Can't write txid 23834590082 expecting nextTxId=23834590042
        at org.apache.hadoop.hdfs.qjournal.server.Journal.checkSync(Journal.java:485)
        at org.apache.hadoop.hdfs.qjournal.server.Journal.journal(Journal.java:371)
        at org.apache.hadoop.hdfs.qjournal.server.JournalNodeRpcServer.journal(JournalNodeRpcServer.java:149)
        at org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolServerSideTranslatorPB.journal(QJournalProtocolServerSideTranslatorPB.java:158)
        at org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos$QJournalProtocolService$2.callBlockingMethod(QJournalProtocolProtos.java:25421)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)
2016-10-28 16:25:43,219 INFO org.apache.hadoop.hdfs.qjournal.server.Journal: Updating lastPromisedEpoch from 54 to 55 for client /master3