ZooKeeper keeps refusing to connect

Expert Contributor

We have a cluster with three ZooKeeper servers running on, say, s1, s2, and s3. The ZooKeeper instance on s3 keeps refusing connections, but not right after I start it. It runs for about 3-4 hours, and then Ambari reports the error: Connection failed: [Errno 111] Connection refused to s3.foo.com:2181.

In the ZooKeeper log on s3, I found the entries pasted below. I have changed the amount of space on the s3 device, and there is plenty left. Any ideas?

2017-09-18 19:48:16,333 - WARN  [SyncThread:3:FileTxnLog@334] - fsync-ing the write ahead log in SyncThread:3 took 1308ms which will adversely effect operation latency. See the ZooKeeper troubleshooting guide
2017-09-18 19:48:18,731 - WARN  [SyncThread:3:FileTxnLog@334] - fsync-ing the write ahead log in SyncThread:3 took 1467ms which will adversely effect operation latency. See the ZooKeeper troubleshooting guide
2017-09-18 19:48:20,164 - WARN  [SyncThread:3:FileTxnLog@334] - fsync-ing the write ahead log in SyncThread:3 took 1431ms which will adversely effect operation latency. See the ZooKeeper troubleshooting guide
2017-09-18 19:48:21,524 - WARN  [SyncThread:3:FileTxnLog@334] - fsync-ing the write ahead log in SyncThread:3 took 1359ms which will adversely effect operation latency. See the ZooKeeper troubleshooting guide
2017-09-18 19:48:23,696 - WARN  [SyncThread:3:FileTxnLog@334] - fsync-ing the write ahead log in SyncThread:3 took 2170ms which will adversely effect operation latency. See the ZooKeeper troubleshooting guide
2017-09-18 19:48:25,416 - WARN  [SyncThread:3:FileTxnLog@334] - fsync-ing the write ahead log in SyncThread:3 took 1576ms which will adversely effect operation latency. See the ZooKeeper troubleshooting guide
2017-09-18 19:48:28,116 - WARN  [SyncThread:3:FileTxnLog@334] - fsync-ing the write ahead log in SyncThread:3 took 2699ms which will adversely effect operation latency. See the ZooKeeper troubleshooting guide
2017-09-18 19:48:30,369 - WARN  [SyncThread:3:FileTxnLog@334] - fsync-ing the write ahead log in SyncThread:3 took 2249ms which will adversely effect operation latency. See the ZooKeeper troubleshooting guide
2017-09-18 19:48:31,564 - WARN  [SyncThread:3:FileTxnLog@334] - fsync-ing the write ahead log in SyncThread:3 took 1195ms which will adversely effect operation latency. See the ZooKeeper troubleshooting guide
2017-09-18 19:48:33,884 - WARN  [SyncThread:3:FileTxnLog@334] - fsync-ing the write ahead log in SyncThread:3 took 1533ms which will adversely effect operation latency. See the ZooKeeper troubleshooting guide
2017-09-18 19:48:39,154 - WARN  [SyncThread:3:FileTxnLog@334] - fsync-ing the write ahead log in SyncThread:3 took 1516ms which will adversely effect operation latency. See the ZooKeeper troubleshooting guide
2017-09-18 19:48:41,043 - ERROR [SyncThread:3:SyncRequestProcessor@183] - Severe unrecoverable error, exiting
java.io.IOException: No space left on device
    at java.io.FileOutputStream.writeBytes(Native Method)
    at java.io.FileOutputStream.write(FileOutputStream.java:326)
    at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
    at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
    at org.apache.zookeeper.server.persistence.FileTxnLog.commit(FileTxnLog.java:322)
    at org.apache.zookeeper.server.persistence.FileTxnSnapLog.commit(FileTxnSnapLog.java:322)
    at org.apache.zookeeper.server.ZKDatabase.commit(ZKDatabase.java:491)
    at org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:196)
    at org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:131)
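
To see how the fsync delays trend in the hours before the crash, the warnings in a log like the one above can be summarized with a short script. This is only a minimal sketch: the function names are hypothetical, and the regular expression assumes the exact warning wording shown in this log.

```python
import re
from statistics import mean

# Assumption: the warning text matches the lines pasted above,
# e.g. "... fsync-ing the write ahead log in SyncThread:3 took 1308ms ..."
FSYNC_RE = re.compile(r"fsync-ing the write ahead log in \S+ took (\d+)ms")


def fsync_latencies_ms(lines):
    """Return the fsync durations (in ms) found in an iterable of log lines."""
    durations = []
    for line in lines:
        match = FSYNC_RE.search(line)
        if match:
            durations.append(int(match.group(1)))
    return durations


def summarize(durations):
    """Build a one-line summary of the slow fsync warnings."""
    if not durations:
        return "no slow fsync warnings found"
    return (f"{len(durations)} slow fsyncs, "
            f"mean {mean(durations):.0f} ms, max {max(durations)} ms")
```

It could be used by opening the ZooKeeper log file and passing the file object to fsync_latencies_ms, then printing summarize() of the result. Steadily rising values would suggest the disk holding the transaction log is under growing I/O pressure.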

1 REPLY

Expert Contributor

Update: I had 4 flows running at the same time against these ZooKeeper instances. After I reduced them from 4 to 2, ZooKeeper stopped crashing.
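
Since the log ends with "No space left on device" even though the device appeared to have room, it may help to record both free bytes and free inodes on the filesystem holding the ZooKeeper data directory while the flows are running; that error can be raised when either one runs out. The sketch below is an assumption-laden helper, not part of ZooKeeper or Ambari: the function name is hypothetical, and the path passed to it should be whatever dataDir (or dataLogDir) is set to in zoo.cfg.

```python
import os
import shutil


def disk_report(path):
    """Return free space and free inodes for the filesystem containing `path`."""
    usage = shutil.disk_usage(path)  # total, used, free in bytes
    stats = os.statvfs(path)         # Unix only; exposes inode counts
    return {
        "total_bytes": usage.total,
        "free_bytes": usage.free,
        "total_inodes": stats.f_files,
        "free_inodes": stats.f_favail,  # inodes available to unprivileged users
    }
```

Calling disk_report with the data directory path every few minutes and logging the result would show whether space or inodes are draining as the number of concurrent flows goes up.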