<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>Namenodes fail starting on HA cluster - Fatals exist in Journalnode logs - Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Namenodes-fail-starting-on-HA-cluster-Fatals-exist-in/m-p/190267#M59027</link>
    <description>&lt;P&gt;In my HA cluster, both of my namenodes fail to start. CentOS 7.3, Ambari 2.4.2, HDP 2.5.3.&lt;/P&gt;</description>
    <pubDate>Thu, 06 Apr 2017 15:54:40 GMT</pubDate>
    <dc:creator>sedatkestepe</dc:creator>
    <dc:date>2017-04-06T15:54:40Z</dc:date>
    <item>
      <title>Namenodes fail starting on HA cluster - Fatals exist in Journalnode logs</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Namenodes-fail-starting-on-HA-cluster-Fatals-exist-in/m-p/190267#M59027</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;In my HA cluster, both of my namenodes fail to start.&lt;/P&gt;&lt;P&gt;CentOS 7.3&lt;/P&gt;&lt;P&gt;Ambari 2.4.2&lt;/P&gt;&lt;P&gt;HDP 2.5.3&lt;/P&gt;&lt;P&gt;Ambari stderr:&lt;/P&gt;&lt;PRE&gt;2017-04-06 10:49:49,039 - Getting jmx metrics from NN failed. URL: &lt;A href="http://master02.mydomain.local:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem" target="_blank"&gt;http://master02.mydomain.local:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem&lt;/A&gt;
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/functions/jmx.py", line 38, in get_value_from_jmx
    _, data, _ = get_user_call_output(cmd, user=run_user, quiet=False)
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/functions/get_user_call_output.py", line 61, in get_user_call_output
    raise ExecutionFailed(err_msg, code, files_output[0], files_output[1])
ExecutionFailed: Execution of 'curl -s 'http://master02.mydomain.local:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem' 1&amp;gt;/tmp/tmp0CNZmD 2&amp;gt;/tmp/tmpRAZgwz' returned 7. 

2017-04-06 10:49:51,041 - Getting jmx metrics from NN failed. URL: &lt;A href="http://master03.mydomain.local:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem" target="_blank"&gt;http://master03.mydomain.local:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem&lt;/A&gt;
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/functions/jmx.py", line 38, in get_value_from_jmx
    _, data, _ = get_user_call_output(cmd, user=run_user, quiet=False)
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/functions/get_user_call_output.py", line 61, in get_user_call_output
    raise ExecutionFailed(err_msg, code, files_output[0], files_output[1])
ExecutionFailed: Execution of 'curl -s 'http://master03.mydomain.local:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem' 1&amp;gt;/tmp/tmp_hLNY7 2&amp;gt;/tmp/tmpoCOTt8' returned 7. 
...
(tries several times and then)
...
Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/namenode.py", line 420, in &amp;lt;module&amp;gt;
    NameNode().execute()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 280, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/namenode.py", line 101, in start
    upgrade_suspended=params.upgrade_suspended, env=env)
  File "/usr/lib/python2.6/site-packages/ambari_commons/os_family_impl.py", line 89, in thunk
    return fn(*args, **kwargs)
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/hdfs_namenode.py", line 184, in namenode
    if is_this_namenode_active() is False:
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/functions/decorator.py", line 55, in wrapper
    return function(*args, **kwargs)
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/hdfs_namenode.py", line 562, in is_this_namenode_active
    raise Fail(format("The NameNode {namenode_id} is not listed as Active or Standby, waiting..."))
resource_management.core.exceptions.Fail: The NameNode nn1 is not listed as Active or Standby, waiting...
&lt;/PRE&gt;&lt;P&gt;Ambari stdout:&lt;/P&gt;&lt;PRE&gt;2017-04-06 10:53:20,521 - call returned (255, '17/04/06 10:53:20 INFO ipc.Client: Retrying connect to server: master03.mydomain.local/10.0.109.21:8020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=1, sleepTime=1000 MILLISECONDS)\n17/04/06 10:53:20 WARN ipc.Client: Failed to connect to server: master03.mydomain.local/10.0.109.21:8020: retries get failed due to exceeded maximum allowed retries number: 1
2017-04-06 10:53:20,522 - No active NameNode was found after 5 retries. Will return current NameNode HA states
&lt;/PRE&gt;&lt;P&gt;Namenode log:&lt;/P&gt;&lt;PRE&gt;2017-04-06 10:11:43,561 FATAL Error: recoverUnfinalizedSegments failed for required journal (JournalAndStream(mgr=QJM to [10.0.109.20:8485, 10.0.109.21:8485, 10.0.109.22:8485], stream=null)) java.lang.AssertionError: Decided to synchronize log to startTxId: 1 endTxId: 1 isInProgress: true but logger 10.0.109.20:8485 had seen txid 1865764 committed at org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.recoverUnclosedSegment(QuorumJournalManager.java:336) at (some class at some other class at ...)&lt;/PRE&gt;&lt;P&gt;Some more logs from Namenode(s):&lt;/P&gt;&lt;PRE&gt;2017-04-05 17:13:42,894 WARN  client.QuorumJournalManager (IPCLoggerChannel.java:call(388)) - Remote journal 10.0.109.22:8485 failed to write txns 1865765-1865765. Will try to write to this JN again after the next log roll.
org.apache.hadoop.ipc.RemoteException(java.io.IOException): IPC's epoch 13 is not the current writer epoch  0
        at org.apache.hadoop.hdfs.qjournal.server.Journal.checkWriteRequest(Journal.java:459)
        at org.apache.hadoop.hdfs.qjournal.server.Journal.journal(Journal.java:351)
        at org.apache.hadoop.hdfs.qjournal.server.JournalNodeRpcServer.journal(JournalNodeRpcServer.java:152)
        at org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolServerSideTranslatorPB.journal(QJournalProtocolServerSideTranslatorPB.java:158)


2017-04-06 10:11:42,380 INFO  ipc.Server (Server.java:logException(2401)) - IPC Server handler 72 on 8020, call org.apache.hadoop.hdfs.server.protocol.DatanodeProtocol.sendHeartbeat from 9.1.10.14:37173 Call#2322 Retry#0
org.apache.hadoop.ipc.RetriableException: NameNode still not started
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.checkNNStartup(NameNodeRpcServer.java:2057)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.sendHeartbeat(NameNodeRpcServer.java:1414)
        at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.sendHeartbeat(DatanodeProtocolServerSideTranslatorPB.java:118)
        at org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:29064)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2313)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2309)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2307)
2017-04-06 10:11:42,390 INFO  namenode.NameNode (NameNode.java:startCommonServices(876)) - NameNode RPC up at: bigm02.etstur.local/9.1.10.21:8020
2017-04-06 10:11:42,391 INFO  namenode.FSNamesystem (FSNamesystem.java:startStandbyServices(1286)) - Starting services required for standby state
2017-04-06 10:11:42,393 INFO  ha.EditLogTailer (EditLogTailer.java:&amp;lt;init&amp;gt;(117)) - Will roll logs on active node at bigm03.etstur.local/9.1.10.22:8020 every 120 seconds.
2017-04-06 10:11:42,397 INFO  ha.StandbyCheckpointer (StandbyCheckpointer.java:start(129)) - Starting standby checkpoint thread...
Checkpointing active NN at &lt;A href="http://bigm03.etstur.local:50070" target="_blank"&gt;http://bigm03.etstur.local:50070&lt;/A&gt;
Serving checkpoints at &lt;A href="http://bigm02.etstur.local:50070" target="_blank"&gt;http://bigm02.etstur.local:50070&lt;/A&gt;
2017-04-06 10:11:43,371 INFO  namenode.FSNamesystem (FSNamesystem.java:stopStandbyServices(1329)) - Stopping services started for standby state
2017-04-06 10:11:43,372 WARN  ha.EditLogTailer (EditLogTailer.java:doWork(349)) - Edit log tailer interrupted
java.lang.InterruptedException: sleep interrupted
        at java.lang.Thread.sleep(Native Method)
        at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:347)
        at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$200(EditLogTailer.java:284)
        at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:301)
        at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:476)
        at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:297)
2017-04-06 10:11:43,475 INFO  namenode.FSNamesystem (FSNamesystem.java:startActiveServices(1130)) - Starting services required for active state
2017-04-06 10:11:43,485 INFO  client.QuorumJournalManager (QuorumJournalManager.java:recoverUnfinalizedSegments(435)) - Starting recovery process for unclosed journal segments...
2017-04-06 10:11:43,534 INFO  client.QuorumJournalManager (QuorumJournalManager.java:recoverUnfinalizedSegments(437)) - Successfully started new epoch 17
2017-04-06 10:11:43,535 INFO  client.QuorumJournalManager (QuorumJournalManager.java:recoverUnclosedSegment(263)) - Beginning recovery of unclosed segment starting at txid 1
2017-04-06 10:11:43,557 INFO  client.QuorumJournalManager (QuorumJournalManager.java:recoverUnclosedSegment(272)) - Recovery prepare phase complete. Responses:
9.1.10.20:8485: segmentState { startTxId: 1 endTxId: 1 isInProgress: true } lastWriterEpoch: 14 lastCommittedTxId: 1865764
9.1.10.21:8485: segmentState { startTxId: 1 endTxId: 1 isInProgress: true } lastWriterEpoch: 14 lastCommittedTxId: 1865764
2017-04-06 10:11:43,560 INFO  client.QuorumJournalManager (QuorumJournalManager.java:recoverUnclosedSegment(296)) - Using longest log: 9.1.10.20:8485=segmentState {
  startTxId: 1
  endTxId: 1
  isInProgress: true
}
lastWriterEpoch: 14
lastCommittedTxId: 1865764


2017-04-06 10:11:43,561 FATAL namenode.FSEditLog (JournalSet.java:mapJournalsAndReportErrors(398)) - Error: recoverUnfinalizedSegments failed for required journal (JournalAndStream(mgr=QJM to [9.1.10.20:8485, 9.1.10.21:8485, 9.1.10.22:8485], stream=null))
java.lang.AssertionError: Decided to synchronize log to startTxId: 1
endTxId: 1
isInProgress: true
 but logger 9.1.10.20:8485 had seen txid 1865764 committed
        at org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.recoverUnclosedSegment(QuorumJournalManager.java:336)
        at org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.recoverUnfinalizedSegments(QuorumJournalManager.java:455)
        at org.apache.hadoop.hdfs.server.namenode.JournalSet$8.apply(JournalSet.java:624)
        at org.apache.hadoop.hdfs.server.namenode.JournalSet.mapJournalsAndReportErrors(JournalSet.java:393)
        at org.apache.hadoop.hdfs.server.namenode.JournalSet.recoverUnfinalizedSegments(JournalSet.java:621)
        at org.apache.hadoop.hdfs.server.namenode.FSEditLog.recoverUnclosedStreams(FSEditLog.java:1459)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startActiveServices(FSNamesystem.java:1139)
        at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.startActiveServices(NameNode.java:1915)
        at org.apache.hadoop.hdfs.server.namenode.ha.ActiveState.enterState(ActiveState.java:61)
        at org.apache.hadoop.hdfs.server.namenode.ha.HAState.setStateInternal(HAState.java:64)
        at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.setState(StandbyState.java:49)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.transitionToActive(NameNode.java:1783)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.transitionToActive(NameNodeRpcServer.java:1631)
        at org.apache.hadoop.ha.protocolPB.HAServiceProtocolServerSideTranslatorPB.transitionToActive(HAServiceProtocolServerSideTranslatorPB.java:107)
        at org.apache.hadoop.ha.proto.HAServiceProtocolProtos$HAServiceProtocolService$2.callBlockingMethod(HAServiceProtocolProtos.java:4460)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2313)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2309)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2307)
2017-04-06 10:11:43,562 INFO  util.ExitUtil (ExitUtil.java:terminate(124)) - Exiting with status 1
2017-04-06 10:11:43,563 INFO  namenode.NameNode (LogAdapter.java:info(47)) - SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at bigm02.etstur.local/9.1.10.21
************************************************************/

&lt;/PRE&gt;&lt;P&gt;And although the JournalNodes started successfully, they show the following error, which also looks suspicious:&lt;/P&gt;&lt;PRE&gt;2017-04-05 17:15:05,653 ERROR RECEIVED SIGNAL 15: SIGTERM&lt;/PRE&gt;&lt;P&gt;The order of execution is also curious. There are the ZKFailoverController processes, the JournalNodes, and the NameNodes. I always thought the NameNode was the last one to start, and that the ZKFailoverController had already determined which of them would be active. Following the logs, both of the namenodes check each other. What is this for? Even the ZKFailoverController tries to reach the NameNodes before it starts itself.&lt;/P&gt;&lt;P&gt;In case you want to know, the background of this error is as follows.&lt;/P&gt;&lt;P&gt;Yesterday I noticed that one of the datanodes had failed and stopped. There were the following errors in the logs:&lt;/P&gt;&lt;PRE&gt;2017-04-05 15:50:11,168 ERROR datanode.DataNode (BPServiceActor.java:run(752)) - Initialization failed for Block pool &amp;lt;registering&amp;gt; (Datanode Uuid be2286f5-00d7-4758-b89a-45e2304cabe3) service to master02.mydomain.local/10.0.109.23:8020. Exiting. java.io.IOException: All specified directories are failed to load. at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:596) at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1483) at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1448) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:319) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:267) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:740) at java.lang.Thread.run(Thread.java:745) 2017-04-05 15:50:11,168 ERROR datanode.DataNode (BPServiceActor.java:run(752)) - Initialization failed for Block pool &amp;lt;registering&amp;gt; (Datanode Uuid be2286f5-00d7-4758-b89a-45e2304cabe3) service to master02.mydomain.local/10.0.109.23:8020. Exiting. org.apache.hadoop.util.DiskChecker$DiskErrorException: Too many failed volumes - current valid volumes: 13, volumes configured: 14, volumes failed: 1, volume failures tolerated: 0

2017-04-05 17:15:36,968 INFO  common.Storage (Storage.java:tryLock(774)) - Lock on /grid/13/hadoop/hdfs/data/in_use.lock acquired by nodename 31353@data02.mydomain.local

&lt;/PRE&gt;&lt;P&gt;Despite the volume errors, I was able to browse /grid/13/.&lt;/P&gt;&lt;P&gt;So I wanted to try the answers from this Stack Overflow question:&lt;/P&gt;&lt;P&gt;&lt;A href="http://stackoverflow.com/questions/22316187/datanode-not-starts-correctly" target="_blank"&gt;http://stackoverflow.com/questions/22316187/datanode-not-starts-correctly&lt;/A&gt;&lt;/P&gt;&lt;P&gt;First I deleted the data folder under /grid/13/hadoop/hdfs (/grid/13/hadoop/hdfs/data) and tried to start the datanode.&lt;/P&gt;&lt;P&gt;It failed again with the same errors, so I went ahead with a namenode format. My cluster was new and empty, so I am fine with any solution, including formats:&lt;/P&gt;&lt;PRE&gt;./hdfs namenode -format -clusterId &amp;lt;myClusterId&amp;gt;&lt;/PRE&gt;&lt;P&gt;After this format, one of the namenodes failed. I tried to restart all HDFS services, and then both namenodes failed.&lt;/P&gt;
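&lt;P&gt;For reference, the HA state of each namenode can also be queried by hand; a minimal sketch, assuming the service IDs nn1 and nn2 from dfs.ha.namenodes.&amp;lt;nameservice&amp;gt; in hdfs-site.xml (adjust to your setup):&lt;/P&gt;&lt;PRE&gt;# Service IDs nn1/nn2 are an assumption; take them from hdfs-site.xml
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2
# The same JMX endpoint Ambari polls can be fetched directly:
curl -s 'http://master02.mydomain.local:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem'&lt;/PRE&gt;&lt;P&gt;Any comments appreciated.&lt;/P&gt;</description>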
      <pubDate>Thu, 06 Apr 2017 15:54:40 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Namenodes-fail-starting-on-HA-cluster-Fatals-exist-in/m-p/190267#M59027</guid>
      <dc:creator>sedatkestepe</dc:creator>
      <dc:date>2017-04-06T15:54:40Z</dc:date>
    </item>
    <item>
      <title>Re: Namenodes fail starting on HA cluster - Fatals exist in Journalnode logs</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Namenodes-fail-starting-on-HA-cluster-Fatals-exist-in/m-p/190268#M59028</link>
      <description>&lt;P&gt;It seems like all the numbers are inconsistent now.&lt;/P&gt;&lt;P&gt;While not sure, I think the epoch numbers are:&lt;/P&gt;&lt;P&gt;nn1: 21&lt;/P&gt;&lt;P&gt;nn2: 13&lt;/P&gt;&lt;P&gt;QJM: 0&lt;/P&gt;&lt;P&gt;I think I executed the format command at a very critical moment.&lt;/P&gt;&lt;P&gt;And there are multiple conflicting txids.&lt;/P&gt;&lt;P&gt;The QJM (JournalNodes) says:&lt;/P&gt;&lt;P&gt;&lt;EM&gt;Client trying to move committed txid backward from 1865764 to 1&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;One of the namenodes says:&lt;/P&gt;&lt;P&gt;&lt;EM&gt;2017-04-07 13:54:05,483 FATAL namenode.FSEditLog (JournalSet.java:mapJournalsAndReportErrors(398)) - Error: recoverUnfinalizedSegments failed for required journal (JournalAndStream(mgr=QJM to [10.0.109.20:8485, 10.0.109.21:8485, 10.0.109.22:8485], stream=null))
java.lang.AssertionError: Decided to synchronize log to startTxId: 1
endTxId: 1
isInProgress: true
 but logger 10.0.109:8485 had seen txid 1865764 committed&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;I need your help.&lt;/P&gt;&lt;P&gt;Since the cluster does not hold any data yet, any solution is fine, including formatting data and/or resetting any component.&lt;/P&gt;
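&lt;P&gt;The epochs and the committed txid can also be read straight off each JournalNode's storage directory; a minimal sketch, assuming dfs.journalnode.edits.dir=/hadoop/hdfs/journal and a nameservice named &amp;lt;nameservice&amp;gt; (both illustrative):&lt;/P&gt;&lt;PRE&gt;# Run on each JournalNode host; paths are assumptions, adjust to your config
cat /hadoop/hdfs/journal/&amp;lt;nameservice&amp;gt;/current/last-promised-epoch
cat /hadoop/hdfs/journal/&amp;lt;nameservice&amp;gt;/current/last-writer-epoch
cat /hadoop/hdfs/journal/&amp;lt;nameservice&amp;gt;/current/committed-txid&lt;/PRE&gt;</description>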
      <pubDate>Fri, 07 Apr 2017 18:14:19 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Namenodes-fail-starting-on-HA-cluster-Fatals-exist-in/m-p/190268#M59028</guid>
      <dc:creator>sedatkestepe</dc:creator>
      <dc:date>2017-04-07T18:14:19Z</dc:date>
    </item>
    <item>
      <title>Re: Namenodes fail starting on HA cluster - Fatals exist in Journalnode logs</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Namenodes-fail-starting-on-HA-cluster-Fatals-exist-in/m-p/190269#M59029</link>
      <description>&lt;P&gt;Now, I am able to start all services.&lt;/P&gt;&lt;P&gt;I checked the namenode files under the namenode data directory (&amp;lt;dfs.namenode.name.dir&amp;gt; in hdfs-site.xml; it can also be found on the Configs tab if you are using Ambari).&lt;/P&gt;&lt;P&gt;I compared the file sizes and modification dates of the fsimage and edit log files on both namenode hosts.&lt;/P&gt;&lt;P&gt;The &lt;EM&gt;hdfs&lt;/EM&gt; command runs through the HDFS client and, as far as I know, works the same whether or not it runs on a namenode host. Even so, I ran the format command on the host where one of the namenodes is installed.&lt;/P&gt;&lt;P&gt;After checking the namenode files, I realised that the format had not affected the other namenode.&lt;/P&gt;&lt;P&gt;So I ran the command on that node as well and then started the services.&lt;/P&gt;&lt;P&gt;This way the namenodes started.&lt;/P&gt;
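&lt;P&gt;A quick way to make the fsimage/edits comparison mentioned above; a minimal sketch, assuming dfs.namenode.name.dir=/hadoop/hdfs/namenode (an illustrative path):&lt;/P&gt;&lt;PRE&gt;# Run on each namenode host; the path is an assumption, take it from hdfs-site.xml
ls -lt /hadoop/hdfs/namenode/current/ | head
# Compare sizes and timestamps of the fsimage_* and edits_* files across both hosts&lt;/PRE&gt;</description>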
      <pubDate>Fri, 07 Apr 2017 19:42:10 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Namenodes-fail-starting-on-HA-cluster-Fatals-exist-in/m-p/190269#M59029</guid>
      <dc:creator>sedatkestepe</dc:creator>
      <dc:date>2017-04-07T19:42:10Z</dc:date>
    </item>
    <item>
      <title>Re: Namenodes fail starting on HA cluster - Fatals exist in Journalnode logs</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Namenodes-fail-starting-on-HA-cluster-Fatals-exist-in/m-p/190270#M59030</link>
      <description>&lt;P&gt;Apparently my previous answer was just a workaround.&lt;/P&gt;&lt;P&gt;After the format everything looked fine, but an Ambari alert definition that checks for checkpoints fired.&lt;/P&gt;&lt;P&gt;The alert description said that a checkpoint had not been made for a long time (it alerts after 6 hours by default; it was more than 60 hours in my case).&lt;/P&gt;&lt;P&gt;So what happened was: I formatted the namenodes, but the QJM was not informed of it. Thus the QJM's txid kept its old value while the namenodes had a different one. As a result, the QJM did not accept new edits from the active namenode, and the standby namenode could not get any edits from the QJM.&lt;/P&gt;&lt;P&gt;After some more searching on Google I found the excellent article below:&lt;/P&gt;&lt;P&gt;&lt;A href="https://nicholasmaillard.wordpress.com/2015/07/20/formatting-hdfs/" target="_blank"&gt;https://nicholasmaillard.wordpress.com/2015/07/20/formatting-hdfs/&lt;/A&gt;&lt;/P&gt;&lt;P&gt;The article describes the right steps to format HDFS on HA-enabled clusters; the commands are consolidated in a sketch after the quoted list:&lt;/P&gt;&lt;P&gt;"&lt;EM&gt;The initial steps are very close&lt;/EM&gt;&lt;/P&gt;&lt;OL&gt;
&lt;LI&gt;&lt;EM&gt;Stop the Hdfs service&lt;/EM&gt;&lt;/LI&gt;&lt;LI&gt;&lt;EM&gt;Start only the journal nodes (as they will need to be made aware of the formatting)&lt;/EM&gt;&lt;/LI&gt;&lt;LI&gt;&lt;EM&gt;On the first namenode (as user hdfs)
&lt;/EM&gt;&lt;OL&gt;
&lt;LI&gt;&lt;EM&gt; hadoop namenode -format &lt;/EM&gt;&lt;/LI&gt;&lt;LI&gt;&lt;EM&gt;hdfs namenode -initializeSharedEdits -force (for the journal nodes)&lt;/EM&gt;&lt;/LI&gt;&lt;LI&gt;&lt;EM&gt;hdfs zkfc -formatZK -force (to force zookeeper to reinitialise)&lt;/EM&gt;&lt;/LI&gt;&lt;LI&gt;&lt;EM&gt;restart that first namenode&lt;/EM&gt;&lt;/LI&gt;&lt;/OL&gt;&lt;/LI&gt;&lt;LI&gt;&lt;EM&gt;On the second namenode
&lt;/EM&gt;&lt;OL&gt;
&lt;LI&gt;&lt;EM&gt;hdfs namenode -bootstrapStandby -force (force synch with first namenode)&lt;/EM&gt;&lt;/LI&gt;&lt;/OL&gt;&lt;/LI&gt;&lt;LI&gt;&lt;EM&gt;On every datanode clear the data directory&lt;/EM&gt;&lt;/LI&gt;&lt;LI&gt;&lt;EM&gt;Restart the HDFS service&lt;/EM&gt;&lt;/LI&gt;&lt;/OL&gt;&lt;P&gt;"&lt;/P&gt;&lt;P&gt;These steps helped me overcome the issue.&lt;/P&gt;
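&lt;P&gt;Consolidated as shell commands, the sequence looks roughly like this; a sketch following the quoted article (the command names and -force flags come from the quoted steps, everything else is illustrative):&lt;/P&gt;&lt;PRE&gt;# 1. Stop HDFS, then start only the JournalNodes
# 2. On the first namenode, as user hdfs:
hadoop namenode -format
hdfs namenode -initializeSharedEdits -force   # make the JournalNodes aware of the format
hdfs zkfc -formatZK -force                    # reinitialise the failover state in ZooKeeper
# ...then restart the first namenode...
# 3. On the second namenode:
hdfs namenode -bootstrapStandby -force        # force a sync with the first namenode
# 4. Clear the data directory on every datanode, then restart the HDFS service&lt;/PRE&gt;</description>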
      <pubDate>Tue, 11 Apr 2017 19:36:38 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Namenodes-fail-starting-on-HA-cluster-Fatals-exist-in/m-p/190270#M59030</guid>
      <dc:creator>sedatkestepe</dc:creator>
      <dc:date>2017-04-11T19:36:38Z</dc:date>
    </item>
    <item>
      <title>Re: Namenodes fail starting on HA cluster - Fatals exist in Journalnode logs</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Namenodes-fail-starting-on-HA-cluster-Fatals-exist-in/m-p/315116#M59031</link>
      <description>&lt;P&gt;The solution provided works for a new environment.&lt;/P&gt;&lt;P&gt;If it is an existing environment, the issue can be addressed as below:&lt;/P&gt;&lt;P&gt;NN1 - started, but its state stays in 'starting' - leave it as is and go to NN2.&lt;/P&gt;&lt;P&gt;NN2 - start NN2 - this start will fix the issue.&lt;/P&gt;</description>
      <pubDate>Fri, 23 Apr 2021 06:57:24 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Namenodes-fail-starting-on-HA-cluster-Fatals-exist-in/m-p/315116#M59031</guid>
      <dc:creator>Venkateshmr</dc:creator>
      <dc:date>2021-04-23T06:57:24Z</dc:date>
    </item>
  </channel>
</rss>

