<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Namenode service failure in Active and standby server in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/Namenode-service-failure-in-Active-and-standby-server/m-p/240752#M202556</link>
    <description>&lt;P&gt;We are running Hadoop cluster in HA environment. The name node service got failed in active and standby servers.&lt;/P&gt;&lt;P&gt;As per the logs, its failed due to QJM waited for quorum and did not get response then timed out.&lt;/P&gt;&lt;P&gt;Logs from Active Node:&lt;/P&gt;&lt;PRE&gt;2018-12-17 20:57:01,587 WARN  client.QuorumJournalManager (QuorumCall.java:waitFor(134)) - Waited 19014 ms (timeout=20000 ms) for a response for sendEdits. No responses yet.
2018-12-17 20:57:02,574 FATAL namenode.FSEditLog (JournalSet.java:mapJournalsAndReportErrors(398)) - Error: flush failed for required journal (JournalAndStream(mgr=QJM to [10.106.8.78:8485, 10.106.8.145:8485, 10.106.8.161:8485], stream=QuorumOutputStream starting at txid 62507384))
java.io.IOException: Timed out waiting 20000ms for a quorum of nodes to respond.
        at org.apache.hadoop.hdfs.qjournal.client.AsyncLoggerSet.waitForWriteQuorum(AsyncLoggerSet.java:137)
        at org.apache.hadoop.hdfs.qjournal.client.QuorumOutputStream.flushAndSync(QuorumOutputStream.java:107)
        at org.apache.hadoop.hdfs.server.namenode.EditLogOutputStream.flush(EditLogOutputStream.java:113)
        at org.apache.hadoop.hdfs.server.namenode.EditLogOutputStream.flush(EditLogOutputStream.java:107)
        at org.apache.hadoop.hdfs.server.namenode.JournalSet$JournalSetOutputStream$8.apply(JournalSet.java:533)
        at org.apache.hadoop.hdfs.server.namenode.JournalSet.mapJournalsAndReportErrors(JournalSet.java:393)
        at org.apache.hadoop.hdfs.server.namenode.JournalSet.access$100(JournalSet.java:57)
        at org.apache.hadoop.hdfs.server.namenode.JournalSet$JournalSetOutputStream.flush(JournalSet.java:529)
        at org.apache.hadoop.hdfs.server.namenode.FSEditLog.logSync(FSEditLog.java:654)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:4018)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:1102)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:630)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2313)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2309)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2307)
2018-12-17 20:57:02,574 WARN  client.QuorumJournalManager (QuorumOutputStream.java:abort(72)) - Aborting QuorumOutputStream starting at txid 62507384
2018-12-17 20:57:02,587 INFO  util.ExitUtil (ExitUtil.java:terminate(124)) - Exiting with status 1
2018-12-17 20:57:02,592 INFO  provider.AuditProviderFactory (AuditProviderFactory.java:run(501)) - ==&amp;gt; JVMShutdownHook.run()
2018-12-17 20:57:02,592 INFO  provider.AuditProviderFactory (AuditProviderFactory.java:run(502)) - JVMShutdownHook: Signalling async audit cleanup to start.
2018-12-17 20:57:02,592 INFO  provider.AuditProviderFactory (AuditProviderFactory.java:run(477)) - RangerAsyncAuditCleanup: Starting cleanup
2018-12-17 20:57:02,592 INFO  provider.AuditProviderFactory (AuditProviderFactory.java:run(506)) - JVMShutdownHook: Waiting up to 30 seconds for audit cleanup to finish.
2018-12-17 20:57:02,593 INFO  queue.AuditBatchQueue (AuditBatchQueue.java:runLogAudit(281)) - Caught exception in consumer thread. Shutdown might be in progress
2018-12-17 20:57:02,593 INFO  queue.AuditBatchQueue (AuditBatchQueue.java:runLogAudit(332)) - Queue is not empty. Will retry. queue.size)=838857, localBatchBuffer.size()=0
2018-12-17 20:57:02,605 INFO  queue.AuditBatchQueue (AuditBatchQueue.java:runLogAudit(332)) - Queue is not empty. Will retry. queue.size)=837857, localBatchBuffer.size(








2018-12-17 20:57:12,571 INFO  queue.AuditBatchQueue (AuditBatchQueue.java:runLogAudit(332)) - Queue is not empty. Will retry. queue.size)=12859, localBatchBuffer.size()=0
2018-12-17 20:57:12,583 INFO  queue.AuditBatchQueue (AuditBatchQueue.java:runLogAudit(332)) - Queue is not empty. Will retry. queue.size)=11859, localBatchBuffer.size()=0
2018-12-17 20:57:12,592 WARN  util.ShutdownHookManager (ShutdownHookManager.java:run(70)) - ShutdownHook 'JVMShutdownHook' timeout, java.util.concurrent.TimeoutException
java.util.concurrent.TimeoutException
        at java.util.concurrent.FutureTask.get(FutureTask.java:205)
        at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:67)
2018-12-17 20:57:12,592 INFO  provider.AuditProviderFactory (AuditProviderFactory.java:run(514)) - JVMShutdownHook: Interrupted while waiting for completion of Async executor!
java.lang.InterruptedException
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1039)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
        at java.util.concurrent.Semaphore.tryAcquire(Semaphore.java:409)
        at org.apache.ranger.audit.provider.AuditProviderFactory$JVMShutdownHook.run(AuditProviderFactory.java:507)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
2018-12-17 20:57:12,592 INFO  provider.AuditProviderFactory (AuditProviderFactory.java:run(516)) - JVMShutdownHook: Interrupting ranger async audit cleanup thread
2018-12-17 20:57:12,592 INFO  provider.AuditProviderFactory (AuditProviderFactory.java:run(518)) - &amp;lt;== JVMShutdownHook.run()
2018-12-17 20:57:12,592 INFO  queue.AuditAsyncQueue (AuditAsyncQueue.java:stop(106)) - Stop called. name=hdfs.async
2018-12-17 20:57:12,593 INFO  queue.AuditAsyncQueue (AuditAsyncQueue.java:stop(110)) - Interrupting consumerThread. name=hdfs.async, consumer=hdfs.async.multi_dest
2018-12-17 20:57:12,593 INFO  provider.AuditProviderFactory (AuditProviderFactory.java:run(481)) - RangerAsyncAuditCleanup: Done cleanup
2018-12-17 20:57:12,593 INFO  provider.AuditProviderFactory (AuditProviderFactory.java:run(470)) - RangerAsyncAuditCleanup: Waiting to audit cleanup start signal
2018-12-17 20:57:12,593 INFO  queue.AuditAsyncQueue (AuditAsyncQueue.java:runLogAudit(155)) - Caught exception in consumer thread. Shutdown might be in progress
2018-12-17 20:57:12,596 INFO  queue.AuditAsyncQueue (AuditAsyncQueue.java:runLogAudit(171)) - Exiting polling loop. name=hdfs.async
2018-12-17 20:57:12,596 INFO  queue.AuditAsyncQueue (AuditAsyncQueue.java:runLogAudit(175)) - Calling to stop consumer. name=hdfs.async, consumer.name=hdfs.async.multi_dest
2018-12-17 20:57:12,596 INFO  namenode.NameNode (LogAdapter.java:info(47)) - SHUTDOWN_MSG:
&lt;/PRE&gt;&lt;P&gt;Logs from the standby NameNode:&lt;/P&gt;&lt;PRE&gt;2018-12-17 20:55:17,191 INFO  authorize.ServiceAuthorizationManager (ServiceAuthorizationManager.java:authorize(137)) - Authorization successful for yarn/prdhdpmn1.example.com@example.COM (auth:KERBEROS) for protocol=interface org.apache.hadoop.hdfs.protocol.ClientProtocol
2018-12-17 20:55:17,742 INFO  provider.BaseAuditHandler (BaseAuditHandler.java:logStatus(312)) - Audit Status Log: name=hdfs.async.multi_dest.batch.solr, interval=01:00.001 minutes, events=3, succcessCount=3, totalEvents=41545284, totalSuccessCount=35647876, totalFailedCount=5897408
2018-12-17 20:55:17,743 INFO  BlockStateChange (BlockManager.java:computeReplicationWorkForBlocks(1588)) - BLOCK* neededReplications = 0, pendingReplications = 0.
2018-12-17 20:55:17,780 INFO  provider.BaseAuditHandler (BaseAuditHandler.java:logStatus(312)) - Audit Status Log: name=hdfs.async.multi_dest.batch.hdfs, interval=01:00.184 minutes, events=1000, deferredCount=1000, totalEvents=11155910, totalSuccessCount=58910, totalDeferredCount=11097000
2018-12-17 20:55:17,780 INFO  destination.HDFSAuditDestination (HDFSAuditDestination.java:createConfiguration(310)) - Returning HDFS Filesystem Config: Configuration: core-default.xml, core-site.xml, hdfs-default.xml, hdfs-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml
2018-12-17 20:55:17,794 INFO  destination.HDFSAuditDestination (HDFSAuditDestination.java:getLogFileStream(271)) - Checking whether log file exists. hdfPath=hdfs://prdhdpmn1.example.com:8020/ranger/audit/hdfs/20181217/hdfs_ranger_audit_prdhdpmn2.example.com.log, UGI=nn/prdhdpmn2.example.com@example.COM (auth:KERBEROS)
2018-12-17 20:55:17,803 WARN  retry.RetryInvocationHandler (RetryInvocationHandler.java:handleException(217)) - Exception while invoking ClientNamenodeProtocolTranslatorPB.getFileInfo over null. Not retrying because try once and fail.
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category READ is not supported in state standby
        at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:87)
        at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:1979)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1345)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFileInfo(FSNamesystem.java:3967)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getFileInfo(NameNodeRpcServer.java:1130)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getFileInfo(ClientNamenodeProtocolServerSideTranslatorPB.java:851)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2313)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2309)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2307)


        at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1552)
        at org.apache.hadoop.ipc.Client.call(Client.java:1496)
        at org.apache.hadoop.ipc.Client.call(Client.java:1396)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
        at com.sun.proxy.$Proxy33.getFileInfo(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:816)
        at sun.reflect.GeneratedMethodAccessor271.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:278)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:194)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:176)
        at com.sun.proxy.$Proxy34.getFileInfo(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2158)
        at org.apache.hadoop.hdfs.DistributedFileSystem$25.doCall(DistributedFileSystem.java:1423)
        at org.apache.hadoop.hdfs.DistributedFileSystem$25.doCall(DistributedFileSystem.java:1419)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1419)
        at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1443)
        at org.apache.ranger.audit.destination.HDFSAuditDestination.getLogFileStream(HDFSAuditDestination.java:273)
        at org.apache.ranger.audit.destination.HDFSAuditDestination.access$000(HDFSAuditDestination.java:44)
        at org.apache.ranger.audit.destination.HDFSAuditDestination$1.run(HDFSAuditDestination.java:159)
        at org.apache.ranger.audit.destination.HDFSAuditDestination$1.run(HDFSAuditDestination.java:156)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
        at org.apache.ranger.audit.destination.HDFSAuditDestination.logJSON(HDFSAuditDestination.java:170)
        at org.apache.ranger.audit.queue.AuditFileSpool.sendEvent(AuditFileSpool.java:880)
        at org.apache.ranger.audit.queue.AuditFileSpool.runLogAudit(AuditFileSpool.java:819)
        at org.apache.ranger.audit.queue.AuditFileSpool.run(AuditFileSpool.java:758)
        at java.lang.Thread.run(Thread.java:745)
2018-12-17 20:55:17,803 ERROR provider.BaseAuditHandler (BaseAuditHandler.java:logError(329)) - Error writing to log file.
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category READ is not supported in state standby
        at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:87)
        at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:1979)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1345)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFileInfo(FSNamesystem.java:3967)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getFileInfo(NameNodeRpcServer.java:1130)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getFileInfo(ClientNamenodeProtocolServerSideTranslatorPB.java:851)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2313)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2309)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2307)


        at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1552)
        at org.apache.hadoop.ipc.Client.call(Client.java:1496)
        at org.apache.hadoop.ipc.Client.call(Client.java:1396)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
        at com.sun.proxy.$Proxy33.getFileInfo(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:816)
        at sun.reflect.GeneratedMethodAccessor271.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:278)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:194)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:176)
        at com.sun.proxy.$Proxy34.getFileInfo(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2158)
        at org.apache.hadoop.hdfs.DistributedFileSystem$25.doCall(DistributedFileSystem.java:1423)
        at org.apache.hadoop.hdfs.DistributedFileSystem$25.doCall(DistributedFileSystem.java:1419)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1419)
        at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1443)
        at org.apache.ranger.audit.destination.HDFSAuditDestination.getLogFileStream(HDFSAuditDestination.java:273)
        at org.apache.ranger.audit.destination.HDFSAuditDestination.access$000(HDFSAuditDestination.java:44)
        at org.apache.ranger.audit.destination.HDFSAuditDestination$1.run(HDFSAuditDestination.java:159)
        at org.apache.ranger.audit.destination.HDFSAuditDestination$1.run(HDFSAuditDestination.java:156)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
        at org.apache.ranger.audit.destination.HDFSAuditDestination.logJSON(HDFSAuditDestination.java:170)
        at org.apache.ranger.audit.queue.AuditFileSpool.sendEvent(AuditFileSpool.java:880)
        at org.apache.ranger.audit.queue.AuditFileSpool.runLogAudit(AuditFileSpool.java:819)
        at org.apache.ranger.audit.queue.AuditFileSpool.run(AuditFileSpool.java:758)
        at java.lang.Thread.run(Thread.java:745)
2018-12-17 20:55:17,803 ERROR queue.AuditFileSpool (AuditFileSpool.java:logError(710)) - Error sending logs to consumer. provider=hdfs.async.multi_dest.batch, consumer=hdfs.async.multi_dest.batch.hdfs
2018-12-17 20:55:17,804 INFO  queue.AuditFileSpool (AuditFileSpool.java:runLogAudit(770)) - Destination is down. sleeping for 30000 milli seconds. indexQueue=2, queueName=hdfs.async.multi_dest.batch, consumer=hdfs.async.multi_dest.batch.hdfs
2018-12-17 20:55:20,744 INFO  BlockStateChange (BlockManager.java:computeReplicationWorkForBlocks(1588)) - BLOCK* neededReplications = 0, pendingReplications = 0.
2018-12-17 20:55:23,744 INFO  BlockStateChange (BlockManager.java:computeReplicationWorkForBlocks(1588)) - BLOCK* neededReplications = 0, pendingReplications = 0.





2018-12-17 20:56:26,779 INFO  BlockStateChange (BlockManager.java:computeReplicationWorkForBlocks(1588)) - BLOCK* neededReplications = 0, pendingReplications = 0.
2018-12-17 20:56:29,779 INFO  BlockStateChange (BlockManager.java:computeReplicationWorkForBlocks(1588)) - BLOCK* neededReplications = 0, pendingReplications = 0.
2018-12-17 20:56:30,179 ERROR client.RangerAdminRESTClient (RangerAdminRESTClient.java:getServicePoliciesIfUpdated(124)) - Error getting policies. secureMode=true, user=nn/prdhdpmn2.example.com@example.COM (auth:KERBEROS), response={"httpStatusCode":401,"statusCode":0}, serviceName=ABC_hadoop
2018-12-17 20:56:30,179 ERROR util.PolicyRefresher (PolicyRefresher.java:loadPolicyfromPolicyAdmin(255)) - PolicyRefresher(serviceName=ABC_hadoop): failed to refresh policies. Will continue to use last known version of policies (222)
java.lang.Exception: HTTP 401
        at org.apache.ranger.admin.client.RangerAdminRESTClient.getServicePoliciesIfUpdated(RangerAdminRESTClient.java:126)
        at org.apache.ranger.plugin.util.PolicyRefresher.loadPolicyfromPolicyAdmin(PolicyRefresher.java:232)
        at org.apache.ranger.plugin.util.PolicyRefresher.loadPolicy(PolicyRefresher.java:188)
        at org.apache.ranger.plugin.util.PolicyRefresher.run(PolicyRefresher.java:158)
2018-12-17 20:56:32,780 INFO  BlockStateChange (BlockManager.java:computeReplicationWorkForBlocks(1588)) - BLOCK* neededReplications = 0, pendingReplications = 0.
2018-12-17 20:56:35,780 INFO  BlockStateChange (BlockManager.java:computeReplicationWorkForBlocks(1588)) - BLOCK* neededReplications = 0, pendingReplications = 0.
2018-12-17 20:56:38,780 INFO  BlockStateChange (BlockManager.java:computeReplicationWorkForBlocks(1588)) - BLOCK* neededReplications = 0, pendingReplications = 0.
2018-12-17 20:56:38,901 INFO  ipc.Server (Server.java:saslProcess(1538)) - Auth successful for jhs/prdhdpmn1.example.com@example.COM (auth:KERBEROS)
2018-12-17 20:56:38,902 INFO  authorize.ServiceAuthorizationManager (ServiceAuthorizationManager.java:authorize(137)) - Authorization successful for jhs/prdhdpmn1.example.com@example.COM (auth:KERBEROS) for protocol=interface org.apache.hadoop.hdfs.protocol.ClientProtocol
2018-12-17 20:56:41,780 INFO  BlockStateChange (BlockManager.java:computeReplicationWorkForBlocks(1588)) - BLOCK* neededReplications = 0, pendingReplications = 0.
2018-12-17 20:56:42,500 INFO  ipc.Server (Server.java:saslProcess(1538)) - Auth successful for ambari-qa-ABC@example.COM (auth:KERBEROS)
2018-12-17 20:56:42,501 INFO  authorize.ServiceAuthorizationManager (ServiceAuthorizationManager.java:authorize(137)) - Authorization successful for ambari-qa-ABC@example.COM (auth:KERBEROS) for protocol=interface org.apache.hadoop.hdfs.protocol.ClientProtocol
2018-12-17 20:56:42,574 INFO  namenode.FSEditLog (FSEditLog.java:printStatistics(716)) - Number of transactions: 2 Total time for transactions(ms): 2 Number of transactions batched in Syncs: 0 Number of syncs: 1 SyncTimes(ms): 2 5
2018-12-17 20:56:43,687 INFO  ipc.Server (Server.java:saslProcess(1538)) - Auth successful for user1@example.COM (auth:KERBEROS)
2018-12-17 20:56:43,687 INFO  authorize.ServiceAuthorizationManager (ServiceAuthorizationManager.java:authorize(137)) - Authorization successful for user1@example.COM (auth:KERBEROS) for protocol=interface org.apache.hadoop.hdfs.protocol.ClientProtocol
2018-12-17 20:56:44,781 INFO  BlockStateChange (BlockManager.java:computeReplicationWorkForBlocks(1588)) - BLOCK* neededReplications = 0, pendingReplications = 0.
2018-12-17 20:56:47,781 INFO  BlockStateChange (BlockManager.java:computeReplicationWorkForBlocks(1588)) - BLOCK* neededReplications = 0, pendingReplications = 0.
2018-12-17 20:56:48,574 INFO  client.QuorumJournalManager (QuorumCall.java:waitFor(136)) - Waited 6001 ms (timeout=20000 ms) for a response for sendEdits. No responses yet.
2018-12-17 20:56:49,576 INFO  client.QuorumJournalManager (QuorumCall.java:waitFor(136)) - Waited 7002 ms (timeout=20000 ms) for a response for sendEdits. No responses yet.






2018-12-17 20:57:00,587 WARN  client.QuorumJournalManager (QuorumCall.java:waitFor(134)) - Waited 18013 ms (timeout=20000 ms) for a response for sendEdits. No responses yet.
2018-12-17 20:57:01,587 WARN  client.QuorumJournalManager (QuorumCall.java:waitFor(134)) - Waited 19014 ms (timeout=20000 ms) for a response for sendEdits. No responses yet.
2018-12-17 20:57:02,574 FATAL namenode.FSEditLog (JournalSet.java:mapJournalsAndReportErrors(398)) - Error: flush failed for required journal (JournalAndStream(mgr=QJM to [&amp;lt;JN1_IP&amp;gt;:8485, &amp;lt;JN2_IP&amp;gt;:8485, &amp;lt;JN3_IP&amp;gt;:8485], stream=QuorumOutputStream starting at txid 62507384))
java.io.IOException: Timed out waiting 20000ms for a quorum of nodes to respond.
        at org.apache.hadoop.hdfs.qjournal.client.AsyncLoggerSet.waitForWriteQuorum(AsyncLoggerSet.java:137)
        at org.apache.hadoop.hdfs.qjournal.client.QuorumOutputStream.flushAndSync(QuorumOutputStream.java:107)
        at org.apache.hadoop.hdfs.server.namenode.EditLogOutputStream.flush(EditLogOutputStream.java:113)
        at org.apache.hadoop.hdfs.server.namenode.EditLogOutputStream.flush(EditLogOutputStream.java:107)
        at org.apache.hadoop.hdfs.server.namenode.JournalSet$JournalSetOutputStream$8.apply(JournalSet.java:533)
        at org.apache.hadoop.hdfs.server.namenode.JournalSet.mapJournalsAndReportErrors(JournalSet.java:393)
        at org.apache.hadoop.hdfs.server.namenode.JournalSet.access$100(JournalSet.java:57)
        at org.apache.hadoop.hdfs.server.namenode.JournalSet$JournalSetOutputStream.flush(JournalSet.java:529)
        at org.apache.hadoop.hdfs.server.namenode.FSEditLog.logSync(FSEditLog.java:654)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:4018)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:1102)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:630)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2313)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2309)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2307)
2018-12-17 20:57:02,574 WARN  client.QuorumJournalManager (QuorumOutputStream.java:abort(72)) - Aborting QuorumOutputStream starting at txid 62507384
2018-12-17 20:57:02,587 INFO  util.ExitUtil (ExitUtil.java:terminate(124)) - Exiting with status 1
2018-12-17 20:57:02,592 INFO  provider.AuditProviderFactory (AuditProviderFactory.java:run(501)) - ==&amp;gt; JVMShutdownHook.run()
2018-12-17 20:57:02,592 INFO  provider.AuditProviderFactory (AuditProviderFactory.java:run(502)) - JVMShutdownHook: Signalling async audit cleanup to start.
2018-12-17 20:57:02,592 INFO  provider.AuditProviderFactory (AuditProviderFactory.java:run(477)) - RangerAsyncAuditCleanup: Starting cleanup
2018-12-17 20:57:02,592 INFO  provider.AuditProviderFactory (AuditProviderFactory.java:run(506)) - JVMShutdownHook: Waiting up to 30 seconds for audit cleanup to finish.
2018-12-17 20:57:02,593 INFO  queue.AuditBatchQueue (AuditBatchQueue.java:runLogAudit(281)) - Caught exception in consumer thread. Shutdown might be in progress
2018-12-17 20:57:02,593 INFO  queue.AuditBatchQueue (AuditBatchQueue.java:runLogAudit(332)) - Queue is not empty. Will retry. queue.size)=838857, localBatchBuffer.size()=0
2018-12-17 20:57:02,605 INFO  queue.AuditBatchQueue (AuditBatchQueue.java:runLogAudit(332)) - Queue is not empty. Will retry. queue.size)=837857, localBatchBuffer.size()=0

2018-12-17 20:57:12,583 INFO  queue.AuditBatchQueue (AuditBatchQueue.java:runLogAudit(332)) - Queue is not empty. Will retry. queue.size)=11859, localBatchBuffer.size()=0
2018-12-17 20:57:12,592 WARN  util.ShutdownHookManager (ShutdownHookManager.java:run(70)) - ShutdownHook 'JVMShutdownHook' timeout, java.util.concurrent.TimeoutException
java.util.concurrent.TimeoutException
        at java.util.concurrent.FutureTask.get(FutureTask.java:205)
        at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:67)
2018-12-17 20:57:12,592 INFO  provider.AuditProviderFactory (AuditProviderFactory.java:run(514)) - JVMShutdownHook: Interrupted while waiting for completion of Async executor!
java.lang.InterruptedException
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1039)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
        at java.util.concurrent.Semaphore.tryAcquire(Semaphore.java:409)
        at org.apache.ranger.audit.provider.AuditProviderFactory$JVMShutdownHook.run(AuditProviderFactory.java:507)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
2018-12-17 20:57:12,592 INFO  provider.AuditProviderFactory (AuditProviderFactory.java:run(516)) - JVMShutdownHook: Interrupting ranger async audit cleanup thread
2018-12-17 20:57:12,592 INFO  provider.AuditProviderFactory (AuditProviderFactory.java:run(518)) - &amp;lt;== JVMShutdownHook.run()
2018-12-17 20:57:12,592 INFO  queue.AuditAsyncQueue (AuditAsyncQueue.java:stop(106)) - Stop called. name=hdfs.async
2018-12-17 20:57:12,593 INFO  queue.AuditAsyncQueue (AuditAsyncQueue.java:stop(110)) - Interrupting consumerThread. name=hdfs.async, consumer=hdfs.async.multi_dest
2018-12-17 20:57:12,593 INFO  provider.AuditProviderFactory (AuditProviderFactory.java:run(481)) - RangerAsyncAuditCleanup: Done cleanup
2018-12-17 20:57:12,593 INFO  provider.AuditProviderFactory (AuditProviderFactory.java:run(470)) - RangerAsyncAuditCleanup: Waiting to audit cleanup start signal
2018-12-17 20:57:12,593 INFO  queue.AuditAsyncQueue (AuditAsyncQueue.java:runLogAudit(155)) - Caught exception in consumer thread. Shutdown might be in progress
2018-12-17 20:57:12,596 INFO  queue.AuditAsyncQueue (AuditAsyncQueue.java:runLogAudit(171)) - Exiting polling loop. name=hdfs.async
2018-12-17 20:57:12,596 INFO  queue.AuditAsyncQueue (AuditAsyncQueue.java:runLogAudit(175)) - Calling to stop consumer. name=hdfs.async, consumer.name=hdfs.async.multi_dest
2018-12-17 20:57:12,596 INFO  namenode.NameNode (LogAdapter.java:info(47)) - SHUTDOWN_MSG:





&lt;/PRE&gt;&lt;P&gt;There were no errors in the JournalNode logs.&lt;/P&gt;&lt;P&gt;We could not identify why the QJM failed to get a response from the JournalNodes, which caused this failure.&lt;/P&gt;&lt;P&gt;Is there any way to find the root cause of this failure?&lt;/P&gt;</description>
    <pubDate>Wed, 19 Dec 2018 20:09:19 GMT</pubDate>
    <dc:creator>sajesh_purayil</dc:creator>
    <dc:date>2018-12-19T20:09:19Z</dc:date>
  </channel>
</rss>