Support Questions


I'm just wondering why we should use an odd number (3, 5, 7) of JournalNodes and ZooKeeper nodes in a Hadoop cluster. What is the rule of (n-1)/2 failures?

Expert Contributor
 
1 ACCEPTED SOLUTION

Master Guru

The reason is that a distributed transactional system using Paxos or a similar consensus algorithm needs a quorum (majority). Essentially, a transaction is committed once more than 50% of the nodes acknowledge it.

You could run 4 JournalNodes / ZooKeeper nodes as well, but you would get no benefit over 3 nodes and you would add overhead. With 4 nodes the majority is 3, so the ensemble can survive 1 failed node but not 2, which is exactly what 3 nodes already give you. That is why you want an odd number: 3 nodes can survive 1 failure, 5 nodes can survive 2 failures, 7 nodes can survive 3 failures, and so on.
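To make the arithmetic concrete, here is a minimal sketch in Python. The function names `quorum_size` and `tolerated_failures` are just illustrative (they are not part of Hadoop or ZooKeeper), and the only assumption is the strict-majority rule described above.

```python
def quorum_size(n: int) -> int:
    """Smallest number of nodes that forms a strict majority of n."""
    return n // 2 + 1


def tolerated_failures(n: int) -> int:
    """Number of nodes that can fail while a strict majority is still up."""
    # Equivalent to (n - 1) // 2 for every n >= 1.
    return n - quorum_size(n)


if __name__ == "__main__":
    for n in range(3, 8):
        print(f"{n} nodes: quorum = {quorum_size(n)}, "
              f"tolerated failures = {tolerated_failures(n)}")
```

For 3 nodes the quorum is 2, so the 2 servers that remain after a single failure are still a majority of the original 3. For 4 nodes the quorum rises to 3, so the extra node buys no additional failure tolerance.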

https://en.wikipedia.org/wiki/Paxos_(computer_science)


4 REPLIES

Expert Contributor

@Benjamin Leonhard

Thank you.


If 3 nodes can survive 1 failure, then how is the quorum (majority) met with only 2 running servers?

Super Collaborator

Additional details on Ben Leonhardi's response, including the formula for quorum calculation:

http://bytecontinnum.com/zookeeper-always-configured-odd-number-nodes/