Created 12-13-2016 11:44 PM
I learned recently that Kafka 0.10 supports rack awareness. I researched the Kafka wiki and JIRA and got some understanding, but it would be great if any of the HCC experts could provide an example of how replication works with rack awareness.
Created 12-13-2016 11:59 PM
As you pointed out, Kafka 0.10.0.0 supports rack awareness. KAFKA-1215 added a rack ID to the Kafka configuration. You can specify that a broker belongs to a particular rack by adding a property to the broker config: broker.rack=my-rack-id.
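For example, each broker advertises its own rack in its server.properties (the rack IDs below are placeholders):

```
# server.properties on a broker in the first rack
broker.rack=rack-1

# server.properties on a broker in the second rack
broker.rack=rack-2
```

Once every broker is tagged this way, newly created topics (and partition reassignments) will spread each partition's replicas across racks.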
The rack awareness feature spreads replicas of the same partition across different racks. This extends the guarantees Kafka provides for broker failure to cover rack failure, limiting the risk of data loss should all the brokers on a rack fail at once. The feature can also be applied to other broker groupings, such as availability zones in EC2.
Let's assume an example with 6 brokers. Brokers 0, 1, and 2 are on one rack, and brokers 3, 4, and 5 are on a separate rack. Instead of picking brokers in the order 0 to 5, we order them 0, 3, 1, 4, 2, 5 - each broker is followed by a broker from a different rack. In this case, if the leader for partition 0 is on broker 4, the next replica will be on broker 2, which is on a completely different rack. If the first rack goes offline, we know that we still have a surviving replica, and therefore the partition is still available. This holds for every partition, so we have guaranteed availability in case of a single-rack failure.
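To make that concrete, here is a minimal Python sketch of the ordering and round-robin assignment. It is a simplification of Kafka's actual algorithm (which also randomizes the starting offsets), and the function names are mine:

```python
def rack_alternating_order(brokers_by_rack):
    # Interleave brokers from different racks, e.g.
    # {"rack-1": [0, 1, 2], "rack-2": [3, 4, 5]} -> [0, 3, 1, 4, 2, 5]
    rack_lists = list(brokers_by_rack.values())
    order = []
    i = 0
    while any(i < len(r) for r in rack_lists):
        for r in rack_lists:
            if i < len(r):
                order.append(r[i])
        i += 1
    return order

def assign_replicas(order, num_partitions, replication_factor):
    # Place each partition's replicas on consecutive brokers in the
    # rack-alternating order, shifting the starting broker per partition,
    # so consecutive replicas land on different racks.
    n = len(order)
    return {p: [order[(p + r) % n] for r in range(replication_factor)]
            for p in range(num_partitions)}

order = rack_alternating_order({"rack-1": [0, 1, 2], "rack-2": [3, 4, 5]})
print(order)                         # [0, 3, 1, 4, 2, 5]
print(assign_replicas(order, 6, 2))  # e.g. partition 3 -> brokers [4, 2]
```

In this sketch, partition 3 gets leader 4 and follower 2 - the leader/replica pair from the example above, one broker from each rack.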
***
If any response was helpful, please vote/accept best answer.
Created 12-13-2016 11:53 PM
Rack awareness for Kafka works similarly in principle to HDFS rack awareness. If you are able to define which rack each of your nodes belongs to, then Kafka can intelligently allocate replicas on nodes that do not share the same rack. This gives you better fault tolerance: if a rack goes down due to maintenance or power loss, there is a reduced chance that a leader and all of its replicas are located in that single rack.
Obviously, this feature is only beneficial if you are able to spread your Kafka brokers across racks. Without rack awareness, Kafka has no way to know which nodes share a rack, which means all of the brokers holding a topic's replicas could be taken offline when a single rack goes down.
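You can verify the placement by describing a topic and checking that each partition's replica list spans racks. For example (the broker IDs and output line are illustrative):

```
# Each "Replicas" list should mix brokers from different racks, e.g.:
#   Topic: my-topic  Partition: 0  Leader: 0  Replicas: 0,3  Isr: 0,3
bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic my-topic
```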
Created 06-08-2017 06:12 PM
Hi,
I am trying to understand the impact on, and design of, the ZooKeeper setup, since Kafka depends on ZooKeeper for its operations.
ZooKeeper needs 2F+1 nodes to tolerate F failures. Consider that I have 2 racks and set up 4 nodes on rack A and 5 on rack B (9 ZooKeeper nodes in total), and rack B goes down (5 ZooKeeper nodes go down). By the 2F+1 rule, surviving 5 simultaneous failures would require 11 ZooKeeper nodes, whereas I have only 9. So whenever the rack holding the majority of nodes fails, ZooKeeper cannot sustain a quorum, which will impact the Kafka cluster's behavior.
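To put numbers on it, a quick sketch of the quorum arithmetic:

```python
def quorum_survives(n_nodes, failed):
    # A ZooKeeper ensemble needs a strict majority (floor(n/2) + 1)
    # of its n nodes alive to maintain a quorum.
    return n_nodes - failed >= n_nodes // 2 + 1

print(quorum_survives(9, 4))  # True: 5 survivors meet the quorum of 5
print(quorum_survives(9, 5))  # False: losing rack B's 5 nodes breaks quorum
```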
Can you please provide your input on how to better set up ZooKeeper so that Kafka can work seamlessly on a two-rack infrastructure?
Created 06-26-2020 05:32 AM
@techsoln were you able to implement this design? Can you please share your experience? I am more curious to know how you managed to work out the ZooKeeper design.
Created 06-26-2020 08:27 AM
@Kapardjh, As this is an older post, you would have a better chance of receiving a resolution by starting a new thread. This will also be an opportunity to provide details specific to your environment that could aid others in assisting you with a more accurate answer to your question.
Regards,
Vidya Sargur
Created 12-13-2016 11:57 PM
@Andi Sonde , not sure if you came across the following in your research: http://kafka.apache.org/documentation.html#basic_ops_racks.
Created 12-14-2016 12:04 AM
@Michael Young, @lgeorge, @Constantin Stanca
Wow! That was quick! Thanks so much, all of you. I voted up all your responses and chose Constantin's for the example, which was very explicit and easy to follow.