Support Questions

Find answers, ask questions, and share your expertise

How client knows zookeeper is lagging behind in transaction?

Can anyone tell me what happens in following scenario

* There are five zookeeper servers s1,s2,s3,s4 and s5

* When client connected to the s3 it was up-to-date

* client made a write request to create /test node to s3 it forwarded to leader(s5)

* As s1,s2,s5 completed that request successfully so,client got the success message

* But due to network problem s3 was not able to complete the work

* And still client is connected with s3

* Now client is making read request for reading /test node to s3

What will happen now, is s3 will throw node not found error or anything else happen?


Super Collaborator

I don't know the details of the zookeeper implementation, but according to the documentation, it guarantuees some things:

  • Sequential Consistency - Updates from a client will be applied in the order that they were sent.
  • Atomicity - Updates either succeed or fail. No partial results.
  • Single System Image - A client will see the same view of the service regardless of the server that it connects to.
  • Reliability - Once an update has been applied, it will persist from that time forward until a client overwrites the update.
  • Timeliness - The clients view of the system is guaranteed to be up-to-date within a certain time bound.

Given those guarantees, s3 will have the correct information in the node /test, once the client received a success message. I am not sure if that means in your example that the client would not get success message.

You probably want to look at Zab for information on how distributed consensus is achieved:

As Harald said, the situation you have described is guaranteed to not happen as this is exactly the kind of problem that ZK was built to prevent.

Yes, dude you are right, zookeeper has the action to prevent this scenario

But my question is what is that action?

How client will come to know that connected zookeeper is lagging behind?

Super Collaborator

the basic idea is, that the zookeeper server will tell the client that he is lagging behind. In your scenario s3 would be a follower of s5 (the leader). So any change is requested to the leader (s5), which will only report successs to the client when all followers have acknowledged the change. But of course in this phase the network between s3 and s5 can fail, and after a timeout, s5 will drop s3 as the follower and still send a success to the client. But at this point s3 also notices that he is no longer a follower, because:

1. the network failed in both directions and s3 has also missed the heartbeat from s5, and therefore now knows that the leader is gone

2. the network connection failed just in the direction s5->s3, then s3 has still processed the update, but s5 will drop s3 as a follower as s5 doesn't receive an acknowledge. It then stops sending heartbeats and sync messages, so for s3 the same situation as in 1. occurs.

Besides this situation the most interesting part is the new leader election when the communication between the current leader and a follower is broken. It is important to ensure that not 2 leaders are elected.