Can anyone explain Kafka rack awareness feature?

Contributor

I learned recently that Kafka 0.10 supports rack awareness. I researched the Kafka wiki and JIRA and got some understanding, but it would be great if any of the HCC experts could provide an example of how replication works with rack awareness.

1 ACCEPTED SOLUTION

Super Guru

@Andi Sonde

As you pointed out, Kafka 0.10.0.0 supports rack awareness. KAFKA-1215 added a rack ID to the Kafka broker config. You can specify that a broker belongs to a particular rack by adding a property to the broker config: broker.rack=my-rack-id.

The rack awareness feature spreads replicas of the same partition across different racks. This extends the guarantees Kafka provides for broker-failure to cover rack-failure, limiting the risk of data loss should all the brokers on a rack fail at once. The feature can also be applied to other broker groupings such as availability zones in EC2.

Let's assume an example with 6 brokers. Brokers 0, 1 and 2 are on one rack, and brokers 3, 4 and 5 are on a second rack. Instead of picking brokers in the order 0 to 5, we order them 0, 3, 1, 4, 2, 5, so that each broker is followed by a broker from a different rack. In this case, if the leader for partition 0 is on broker 4, the next replica will be on broker 2, which is on the other rack. If either rack goes offline, we know that we still have a surviving replica, and therefore the partition is still available. The same holds for every partition with a replication factor of at least 2, so the data stays available through a single rack failure.
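
To make the example concrete, here is a minimal Python sketch of the ordering described above. It is not Kafka's actual assignment code, only a simplified illustration: the function names `rack_alternating_order` and `assign_replicas` and the rack labels `rack-a` and `rack-b` are made up for this example, and it assumes the racks hold roughly equal numbers of brokers.

```python
from itertools import zip_longest


def rack_alternating_order(broker_racks):
    """Interleave brokers rack by rack so neighbours sit on different racks.

    broker_racks maps broker id -> rack label (hypothetical labels).
    """
    by_rack = {}
    for broker, rack in sorted(broker_racks.items()):
        by_rack.setdefault(rack, []).append(broker)

    ordered = []
    # Take one broker from each rack in turn: 0,3 then 1,4 then 2,5.
    for group in zip_longest(*(by_rack[rack] for rack in sorted(by_rack))):
        ordered.extend(b for b in group if b is not None)
    return ordered


def assign_replicas(ordered, leader, replication_factor):
    """Pick the leader and the brokers that follow it in the ordered list."""
    start = ordered.index(leader)
    return [ordered[(start + i) % len(ordered)] for i in range(replication_factor)]


broker_racks = {0: "rack-a", 1: "rack-a", 2: "rack-a",
                3: "rack-b", 4: "rack-b", 5: "rack-b"}

ordered = rack_alternating_order(broker_racks)
replicas = assign_replicas(ordered, leader=4, replication_factor=2)

print(ordered)   # [0, 3, 1, 4, 2, 5]
print(replicas)  # [4, 2]
print([broker_racks[b] for b in replicas])  # ['rack-b', 'rack-a']
```

With the leader on broker 4, the follower lands on broker 2, so the two copies sit on different racks. If the racks hold very different numbers of brokers, this simple interleaving can place two brokers from the same rack next to each other, so treat it as a sketch of the idea only.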

***

If any response was helpful, please vote/accept best answer.

8 REPLIES

Super Guru

@Andi Sonde

Rack awareness for Kafka works similarly in principle to HDFS rack awareness. If you define which rack each of your nodes belongs to, Kafka can place the replicas of a partition on nodes that do not share a rack. This gives you better fault tolerance: if a rack goes down due to maintenance or power loss, there is a reduced chance that the leader and all of its replicas are located in that single rack.

Obviously, this feature is only beneficial if you are able to spread your Kafka brokers across racks. Without rack awareness, Kafka has no way to know which nodes are in a common rack. That means all of the brokers holding the replicas of a topic partition could be taken offline when a single rack goes down.

Explorer

Hi,

I am trying to understand the impact on, and the design for, the ZooKeeper setup, since Kafka depends on ZooKeeper for its operations.

ZooKeeper specifies that 2F+1 nodes are needed to tolerate F failures. Suppose I have 2 racks and set up 4 nodes on rack A and 5 on rack B (9 ZooKeeper nodes in total), and rack B goes down (5 ZooKeeper nodes go down). Tolerating 5 failures under the 2F+1 requirement would need 11 ZooKeeper nodes, whereas I have only 9. So if the rack with the higher number of nodes fails, ZooKeeper will not be able to keep running, which will impact the Kafka cluster.

Can you please provide your inputs on how to better set up ZooKeeper so that Kafka can work seamlessly on a 2-rack infrastructure?
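
The quorum arithmetic in this question can be checked with a short Python sketch. It assumes only the standard majority rule (an ensemble of N nodes stays available while more than N/2 nodes are up). The helper names and the three-location layout at the end are hypothetical and used purely for illustration.

```python
def quorum_size(ensemble_size):
    """Smallest number of nodes that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """F in the 2F+1 rule: how many nodes may fail while a majority remains."""
    return (ensemble_size - 1) // 2


def survives_rack_loss(nodes_per_rack, failed_rack):
    """True if the nodes outside failed_rack still form a majority."""
    total = sum(nodes_per_rack.values())
    remaining = total - nodes_per_rack[failed_rack]
    return remaining >= quorum_size(total)


two_racks = {"A": 4, "B": 5}            # layout from the question
three_racks = {"A": 3, "B": 3, "C": 3}  # hypothetical alternative layout

print(quorum_size(9))                       # 5
print(tolerated_failures(9))                # 4
print(survives_rack_loss(two_racks, "A"))   # True  (5 of 9 nodes remain)
print(survives_rack_loss(two_racks, "B"))   # False (only 4 of 9 remain)
print(all(survives_rack_loss(three_racks, r) for r in three_racks))  # True
```

With only two racks, one of them always holds a majority of an odd-sized ensemble, so losing that rack loses the quorum no matter how many nodes are added. Spreading the same 9 nodes over three failure domains, as in the hypothetical layout, keeps a majority when any single one is lost.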

New Contributor

@techsoln, were you able to implement this design? Could you please share your experience? I am curious to know how you arrived at the ZooKeeper design.

Community Manager

@Kapardjh, As this is an older post, you would have a better chance of receiving a resolution by starting a new thread. This will also be an opportunity to provide details specific to your environment that could aid others in assisting you with a more accurate answer to your question. 



Regards,

Vidya Sargur,
Community Manager



Super Collaborator

@Andi Sonde, not sure if you came across the following in your research: http://kafka.apache.org/documentation.html#basic_ops_racks

Contributor

@Michael Young,@lgeorge,@Constantin Stanca

Wow! That was quick! Thanks so much, all of you. I voted up all your responses and chose Constantin's for the example, which was very explicit and easy to follow.
