question Re: NameNode and Standbynamenode auto stopped in Support Questions

NameNode and Standbynamenode auto stopped

vinayk — Fri, 16 Mar 2018 17:02:26 GMT

Hi,

I'm having 1 master namenode, 1 standby namenode and 4 data node, and 3 journal node.

Before enable HA of Name node , Cluster was working fine. When we enabled HA of NAme node we are facing problem of auto stopped Name node. Below logs find on name node:

2018-03-16 05:49:34,799 INFO BlockStateChange (BlockManager.java:computeReplicationWorkForBlocks(1648)) - BLOCK* neededReplications = 0, pendingReplications = 0. 2018-03-16 05:49:35,684 WARN client.QuorumJournalManager (QuorumCall.java:waitFor(134)) - Waited 16048 ms (timeout=20000 ms) for a response for sendEdits. Succeeded so far: [10.10.20.5:8485] 2018-03-16 05:49:36,686 WARN client.QuorumJournalManager (QuorumCall.java:waitFor(134)) - Waited 17050 ms (timeout=20000 ms) for a response for sendEdits. Succeeded so far: [10.10.20.5:8485] 2018-03-16 05:49:37,688 WARN client.QuorumJournalManager (QuorumCall.java:waitFor(134)) - Waited 18052 ms (timeout=20000 ms) for a response for sendEdits. Succeeded so far: [10.10.20.5:8485] 2018-03-16 05:49:37,850 INFO BlockStateChange (BlockManager.java:computeReplicationWorkForBlocks(1648)) - BLOCK* neededReplications = 0, pendingReplications = 0. 2018-03-16 05:49:38,690 WARN client.QuorumJournalManager (QuorumCall.java:waitFor(134)) - Waited 19054 ms (timeout=20000 ms) for a response for sendEdits. Succeeded so far: [10.10.20.5:8485]

2018-03-16 05:49:39,637 FATAL namenode.FSEditLog (JournalSet.java:mapJournalsAndReportErrors(398)) - Error: flush failed for required journal (JournalAndStream(mgr=QJM to [10.10.20.5:8485, 10.10.20.15:8485, 10.10.20.13:8485], stream=QuorumOutputStream starting at txid 5667564)) java.io.IOException: Timed out waiting 20000ms for a quorum of nodes to respond. at org.apache.hadoop.hdfs.qjournal.client.AsyncLoggerSet.waitForWriteQuorum(AsyncLoggerSet.java:137) at org.apache.hadoop.hdfs.qjournal.client.QuorumOutputStream.flushAndSync(QuorumOutputStream.java:107) at org.apache.hadoop.hdfs.server.namenode.EditLogOutputStream.flush(EditLogOutputStream.java:113) at org.apache.hadoop.hdfs.server.namenode.EditLogOutputStream.flush(EditLogOutputStream.java:107) at org.apache.hadoop.hdfs.server.namenode.JournalSet$JournalSetOutputStream$8.apply(JournalSet.java:533) at org.apache.hadoop.hdfs.server.namenode.JournalSet.mapJournalsAndReportErrors(JournalSet.java:393) at org.apache.hadoop.hdfs.server.namenode.JournalSet.access$100(JournalSet.java:57) at org.apache.hadoop.hdfs.server.namenode.JournalSet$JournalSetOutputStream.flush(JournalSet.java:529) at org.apache.hadoop.hdfs.server.namenode.FSEditLog.logSync(FSEditLog.java:707) at org.apache.hadoop.hdfs.server.namenode.FSEditLog.logSync(FSEditLog.java:641) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFile(FSNamesystem.java:3722) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.complete(NameNodeRpcServer.java:912) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.complete(ClientNamenodeProtocolServerSideTranslatorPB.java:548) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2345) 2018-03-16 05:49:39,638 WARN client.QuorumJournalManager (QuorumOutputStream.java:abort(72)) - Aborting QuorumOutputStream starting at txid 5667564 2018-03-16 05:49:39,647 INFO util.ExitUtil (ExitUtil.java:terminate(124)) - Exiting with status 1 2018-03-16 05:49:39,663 INFO namenode.NameNode (LogAdapter.java:info(47)) - SHUTDOWN_MSG:

Please assist..!!!

Re: NameNode and Standbynamenode auto stopped

Shelton — Fri, 16 Mar 2018 18:01:33 GMT

@Vinay K

How many zookeepers do you have up and running?

Make sure all the 3 JournalManager are running too.

Re: NameNode and Standbynamenode auto stopped

vinayk — Fri, 16 Mar 2018 18:25:18 GMT

3 zookeeper are running and 3 journalmanager are running.

Re: NameNode and Standbynamenode auto stopped

Shelton — Fri, 16 Mar 2018 20:54:00 GMT

@Vinay K

It seems the disk the journals are on is just too busy.

If it's not a big and production cluster can you reboot it. Can you also adjust the following in the two files and retry!

hdfs-site.xml

dfs.qjournal.start-segment.timeout.ms = 90000 
dfs.qjournal.select-input-streams.timeout.ms = 90000 
dfs.qjournal.write-txns.timeout.ms = 90000

core-site.xml

ipc.client.connect.timeout = 90000

Please revert

Re: NameNode and Standbynamenode auto stopped

vinayk — Mon, 19 Mar 2018 13:57:38 GMT

I had defined these configuration. I got Namenode shutdown automatically while Standbynamenode was running today.

Re: NameNode and Standbynamenode auto stopped

Shelton — Mon, 19 Mar 2018 15:17:26 GMT

@Vinay K

Can you confirm that the Active and Standby Namenodes were running normally and then suddenly the primary went down? Is there something, in particular, you did before the shutdown?

Is the time on the QJM's servers in sync? Is this a test cluster? If so and you don't risk losing date then you could proceed with
stopping the Hdfs service Start only the journal nodes (as they will need to be made aware of the formatting)

On the first namenode (as user hdfs)

$ hadoop namenode -format hdfs namenode -initializeSharedEdits -force

For the journal nodes

$ hdfs zkfc -formatZK -force

To force zookeeper to reinitialise) restart that first namenode.

On the second namenode

$ hdfs namenode -bootstrapStandby -force

(force synch with first namenode)

On every datanode clear the data directory Restart the HDFS service This was a very simple step by step guide to formatting.

Re: NameNode and Standbynamenode auto stopped

sandeepksaini — Mon, 19 Mar 2018 18:11:57 GMT

@Vinay K

Problem:

There are two problem as per the logs shared :

1) Active NameNode is rolls the edit logs in it's local disk and by using RPC call it send that to journal node. ANN using flush function like the Native call happens in Unix/Linux write to journal nodes. So, here the fatal error shown in log file could be due to:

a) As per log problem is with only one journal node i.e. (10.10.20.5). Might be Journal node process is not running on host. You can check them using :

ps -eaf| grep journal

b) RPC ports not actively listening on journal node, check them using below command:

netstat -ntlp |grep  8485

c) Stop firewalld/Iptable services on the ANN(Active namenode), also on Journal node. to make sure these are not blocking the RPC call.

systemctl stop firewalld

d) Another probable cause could be your disk is heavily busy on that specific JN which is resulting in time out. Check that using iostat command in linux/unix.

iostat

And check for the disk i/o where your edit logs are being saved.

2) The second error is due to the problem with this journal node (10.10.20.5). Once you rectify the problem with this journal node I think you will be sorted.

Also, one more thing to add please check if the time on all journal node are same and in sync. If you have NTP service running on your server please check if NTP server is picking up the right time.

You can check the time on these node using date command:

date

Re: NameNode and Standbynamenode auto stopped

vinayk — Mon, 19 Mar 2018 19:16:54 GMT

Hi,

There was no activity performed before its shutdown. Its normally running. Before going to shutdown, i found some logs on primary node:

2018-03-17 06:51:06,409 INFO namenode.RedundantEditLogInputStream (RedundantEditLogInputStream.java:nextOp(177)) - Fast-forwarding stream 'http://slave1.bd-ds.com:8480 /getJournal?jid=bddscluster&segmentTxId=6051162&storageInfo=-63%3A1566454145%3A0%3ACID-26a3ddc9-57c2-49c8-848d-3e28567cd7e7' to transaction ID 6051162 2018-03-17 06:51:06,417 INFO namenode.FSImage (FSEditLogLoader.java:loadFSEdits(145)) - Edits file http://slave1.bd-ds.com:8480/getJournal?jid=bddscluster&segmentTxId= 6051162&storageInfo=-63%3A1566454145%3A0%3ACID-26a3ddc9-57c2-49c8-848d-3e28567cd7e7, http://slave2.bd-ds.com:8480/getJournal?jid=bddscluster&segmentTxId=6051162&storage Info=-63%3A1566454145%3A0%3ACID-26a3ddc9-57c2-49c8-848d-3e28567cd7e7, http://slave6.bd-ds.com:8480/getJournal?jid=bddscluster&segmentTxId=6051162&storageInfo=-63%3A1566 454145%3A0%3ACID-26a3ddc9-57c2-49c8-848d-3e28567cd7e7 of size 47892 edits # 300 loaded in 0 seconds 2018-03-17 06:51:06,417 INFO ha.EditLogTailer (EditLogTailer.java:doTailEdits(275)) - Loaded 300 edits starting from txid 6051161 2018-03-17 06:53:06,436 INFO ha.EditLogTailer (EditLogTailer.java:triggerActiveLogRoll(323)) - Triggering log roll on remote NameNode 2018-03-17 06:53:26,458 WARN ha.EditLogTailer (EditLogTailer.java:triggerActiveLogRoll(339)) - Unable to trigger a roll of the active NN java.util.concurrent.ExecutionException: java.net.SocketTimeoutException: Call From slave0.bd-ds.com/10.10.20.7 to slave1.bd-ds.com:8020 failed on socket timeout except ion: java.net.SocketTimeoutException: 20000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/10.10.2 0.7:51810 remote=slave1.bd-ds.com/10.10.20.12:8020]; For more details see: http://wiki.apache.org/hadoop/SocketTimeout at java.util.concurrent.FutureTask.report(FutureTask.java:122) at java.util.concurrent.FutureTask.get(FutureTask.java:206) at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.triggerActiveLogRoll(EditLogTailer.java:327) at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:386) at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:368) Caused by: java.net.SocketTimeoutException: Call From slave0.bd-ds.com/10.10.20.7 to slave1.bd-ds.com:8020 failed on socket timeout exception: java.net.SocketTimeoutExc eption: 20000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/10.10.20.7:51810 remote=slave1.bd-ds. com/10.10.20.12:8020]; For more details see: http://wiki.apache.org/hadoop/SocketTimeout

Some zooKeeper logs are:

2018-03-17 06:55:19,565 WARN zookeeper.ClientCnxn (ClientCnxn.java:run(1146)) - Session 0x1622ffcd77e0000 for server null, unexpected error, closing socket connection and attempting reconnect java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361) at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1125) 2018-03-17 06:55:21,461 INFO zookeeper.ClientCnxn (ClientCnxn.java:logStartConnect(1019)) - Opening socket connection to server 10.10.20.12/10.10.20.12:2181. Will not attempt to authenticate using SASL (unknown error) 2018-03-17 06:55:22,284 INFO zookeeper.ClientCnxn (ClientCnxn.java:run(1140)) - Client session timed out, have not heard from server in 2618ms for sessionid 0x1622ffcd 77e0000, closing socket connection and attempting reconnect 2018-03-17 06:55:22,392 FATAL ha.ActiveStandbyElector (ActiveStandbyElector.java:fatalError(695)) - Received stat error from Zookeeper. code:CONNECTIONLOSS. Not retryin g further znode monitoring connection errors. 2018-03-17 06:55:22,393 INFO ha.ActiveStandbyElector (ActiveStandbyElector.java:terminateConnection(835)) - Terminating ZK connection for elector id=324457684 appData= 0a0b62646473636c757374657212036e6e311a10736c617665302e62642d64732e636f6d20d43e28d33e cb=Elector callbacks for NameNode at slave0.bd-ds.com/10.10.20.7:8020 2018-03-17 06:55:23,193 INFO zookeeper.ZooKeeper (ZooKeeper.java:close(684)) - Session: 0x1622ffcd77e0000 closed 2018-03-17 06:55:23,193 FATAL ha.ZKFailoverController (ZKFailoverController.java:fatalError(374)) - Fatal error occurred:Received stat error from Zookeeper. code:CONNEC TIONLOSS. Not retrying further znode monitoring connection errors. 2018-03-17 06:55:23,193 INFO zookeeper.ClientCnxn (ClientCnxn.java:run(524)) - EventThread shut down 2018-03-17 06:55:23,194 INFO ipc.Server (Server.java:stop(2752)) - Stopping server on 8019 2018-03-17 06:55:23,194 INFO ipc.Server (Server.java:run(1069)) - Stopping IPC Server Responder 2018-03-17 06:55:23,194 INFO ha.ActiveStandbyElector (ActiveStandbyElector.java:quitElection(406)) - Yielding from election 2018-03-17 06:55:23,194 INFO ipc.Server (Server.java:run(932)) - Stopping IPC Server listener on 8019 2018-03-17 06:55:23,195 INFO ha.ActiveStandbyElector (ActiveStandbyElector.java:terminateConnection(832)) - terminateConnection, zkConnectionState = TERMINATED 2018-03-17 06:55:23,195 INFO ha.HealthMonitor (HealthMonitor.java:shutdown(151)) - Stopping HealthMonitor thread 2018-03-17 06:55:23,196 INFO ha.ActiveStandbyElector (ActiveStandbyElector.java:terminateConnection(832)) - terminateConnection, zkConnectionState = TERMINATED 2018-03-17 06:55:23,196 FATAL tools.DFSZKFailoverController (DFSZKFailoverController.java:main(193)) - Got a fatal error, exiting now java.lang.RuntimeException: ZK Failover Controller failed: Received stat error from Zookeeper. code:CONNECTIONLOSS. Not retrying further znode monitoring connection err ors. at org.apache.hadoop.ha.ZKFailoverController.mainLoop(ZKFailoverController.java:369) at org.apache.hadoop.ha.ZKFailoverController.doRun(ZKFailoverController.java:238) at org.apache.hadoop.ha.ZKFailoverController.access$000(ZKFailoverController.java:61) at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.main(DFSZKFailoverController.java:191) 2018-03-17 06:55:23,205 INFO tools.DFSZKFailoverController (LogAdapter.java:info(45)) - SHUTDOWN_MSG:

These are Dev server. Some data held on these server.

Time of QJM server is in sync.

Re: NameNode and Standbynamenode auto stopped

Shelton — Mon, 19 Mar 2018 19:44:41 GMT

@Vinay K

I can see 2 things all connect to connectivity

Received stat error from Zookeeper. code: CONNECTIONLOSS. Not retrying further znode monitoring connection errors.

Can you validate your ZKFailoverControllers are running

The root cause of a Socket Timeout is a connectivity failure between the machines.

Check the settings: is this the machine you really wanted to talk to?

- From the machine that is raising the exception, can you resolve the hostname. 
- Is that resolved hostname the correct one? Can you ping the remote host? 
- Is the target machine running the relevant Hadoop processes? 
- Can you telnet to the target host and port? 
- Can you telnet to the target host and port from any other machine? 
- On the target machine, can you telnet to the port using localhost as the hostname. If this works but external network connections time out, it's usually a firewall issue.

Please check the above especially the firewall

Re: NameNode and Standbynamenode auto stopped

sandeepksaini — Mon, 19 Mar 2018 21:05:50 GMT

@Vinay K

Thanks for confirming the time sync.

Probable cause could be from your firewall blocking the TCP communication on network or other process on host using the hadoop designated ports.

Reason for the justification: In the first log you shared with us has the same problem of RPC connection and due to which it was not able to roll the edit logs.

Now, you have shared logs of namenode and zoo keeper in which it clearly shows the error message "connection refused". For this type of problem even if you increase your time out to unlimited it is not going to help you.

Check for the below things for me:

a) Check if firewall/iptables are turned off. If not then do turn it off as per the command shared :

systemctl stop firewalld

b) Check if you are able to ping to the slave nodes from master node using ping command:

ping <ip_address>

c) Check if you host is able to resolve the hostname and ip address of the slave nodes(Ping using hostnames of slaves):

Run these commands from:

ANN -> JN

ANN -> ZK

ANN -> SNN

ping <hostname> 
dig -x <ip_address_of_slave>

d) Did you check for the port opened and not being used by any other process? Use the command as shared in my previous post.

Re: NameNode and Standbynamenode auto stopped

vinayk — Tue, 20 Mar 2018 12:13:56 GMT

Hi,

I am able to telnet, ping to target machine and hostname.

It may be problem has resolved. Today HA NN in running state. I have done some changing. Let me share with you all.

1. First i have 3 three zookeeper server running. I had changed the zookeeper location from server 1 to server 4, meanwhile ha.zookeeper.quorum was still server1.zk.com:2181,server2.zk.com:2181,server3.zk.com:2181 even after restart all services.

I have changed it manually to server4.zk.com:2181,server2.zk.com:2181,server3.zk.com:2181.

2. Second i had done some modification in zookeeper configuration zoo.cfg

changed default to syncLimit=15, tickTime=4000 and initLimit=30.

Now it will take 15*4000=60sec to sync.

3. I have define the suggested properties in configuration files:

hdfs-site.xml

dfs.qjournal.start-segment.timeout.ms =90000
dfs.qjournal.select-input-streams.timeout.ms =90000
dfs.qjournal.write-txns.timeout.ms =90000

core-site.xml

ipc.client.connect.timeout =90000

4. All three JournalNode installed on Both Namenode instance and third installed on DataNode Server.

Today it seems everything is working fine.

Re: NameNode and Standbynamenode auto stopped

Shelton — Tue, 20 Mar 2018 17:54:12 GMT

@Vinay K

@Sandeep Kumar

It's great that your problem has been resolved. It isn't normal that someone attributes himself the correct answer when other HCC members contributed to the answer namely Sandeep and I.

Your solution which was suggested by me.

1 Quorum of zookeeper 
3. Changes in the hdfs-site.xml/core-site.xml config 
4 Journal nodes /Sandeep too

So taking account the above I guess someone else merited the point 🙂

Re: NameNode and Standbynamenode auto stopped

vinayk — Tue, 20 Mar 2018 17:59:51 GMT

@Geoffrey Shelton

If you not comfort i can remove 2 point. I'd found 2nd point in zookeeper documentation.

Re: NameNode and Standbynamenode auto stopped

Shelton — Tue, 20 Mar 2018 18:19:54 GMT

@Vinay K

Not important to me though but ethically . I am fine with all 🙂

Re: NameNode and Standbynamenode auto stopped

vinayk — Tue, 20 Mar 2018 18:27:05 GMT

@Geoffrey S. O.

May be that point is not meaningful.

Thanks for help me. You guys spent your precious time in community, That is appreciable.