Created on 07-13-2017 02:58 PM
When a Kafka cluster is over-subscribed, the loss of a single broker can be a jarring experience for the cluster as a whole. This is especially true when trying to bring a previously failed broker back into a cluster.
When a broker has been out of the cluster for a number of days, you can soften the impact of its return by removing its broker ID from the Replicas list of every partition before it re-enters the cluster.
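As a rough illustration of that idea (this is not the code from the scripts discussed in this article), the sketch below takes a partition assignment in the JSON layout that `kafka-reassign-partitions.sh` reads and builds a reassignment that drops one broker ID from every replica list. The function name `drop_broker`, the topic names, and the broker IDs are hypothetical. Skipping partitions whose only replica is the returning broker is an assumption about what you would want, not something the article prescribes.

```python
import json


def drop_broker(assignment, broker_id):
    """Return a reassignment plan with broker_id removed from replica lists.

    `assignment` uses the {"version": 1, "partitions": [...]} layout
    understood by kafka-reassign-partitions.sh. Only partitions that
    actually change are included in the result.
    """
    changed = []
    for p in assignment["partitions"]:
        replicas = [r for r in p["replicas"] if r != broker_id]
        # Skip partitions that do not list the broker, and (assumption)
        # partitions that would be left with no replicas at all.
        if len(replicas) == len(p["replicas"]) or not replicas:
            continue
        changed.append(
            {"topic": p["topic"], "partition": p["partition"], "replicas": replicas}
        )
    return {"version": 1, "partitions": changed}


# Hypothetical sample assignment; broker 3 is the one coming back.
current = {
    "version": 1,
    "partitions": [
        {"topic": "events", "partition": 0, "replicas": [1, 2, 3]},
        {"topic": "events", "partition": 1, "replicas": [1, 2]},
        {"topic": "logs", "partition": 0, "replicas": [3]},
    ],
}

plan = drop_broker(current, 3)
print(json.dumps(plan, indent=2))
```

The printed JSON could be saved to a file and passed to `kafka-reassign-partitions.sh` with `--reassignment-json-file` and `--execute`. Partitions that were skipped because the broker is their only replica need separate handling.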
Generally, you want a Kafka cluster that is sized to handle single-node failures, but, as is often the case, the workload on the cluster can quickly grow beyond what the hardware can support. While you're waiting for new hardware to arrive to augment the cluster, you still need to keep the existing cluster working as well as possible.
To that end, there are some AWK scripts available on GitHub that help create the JSON files needed to essentially spoon-feed partitions back onto a broker.
This collection of scripts, playfully called Kawkfa, is still alpha at best and has its bugs, but someone may find it useful in the above situation.
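To make "spoon-feeding" concrete, here is a minimal Python sketch of the general technique. It is not taken from Kawkfa, and the names `readd_in_batches`, `batch_size`, and the sample data are hypothetical. It assumes the input lists only the partitions the broker should rejoin, and it appends the broker to the end of each replica list so that the preferred leader (the first replica) stays the same.

```python
import json


def readd_in_batches(assignment, broker_id, batch_size):
    """Yield reassignment plans that add broker_id to a few partitions at a time."""
    # Only partitions that do not already list the broker need work.
    pending = [p for p in assignment["partitions"] if broker_id not in p["replicas"]]
    for start in range(0, len(pending), batch_size):
        batch = pending[start:start + batch_size]
        yield {
            "version": 1,
            "partitions": [
                {
                    "topic": p["topic"],
                    "partition": p["partition"],
                    # Append, so the existing preferred leader stays first.
                    "replicas": p["replicas"] + [broker_id],
                }
                for p in batch
            ],
        }


# Hypothetical sample assignment; broker 3 is being fed back in.
sample = {
    "version": 1,
    "partitions": [
        {"topic": "events", "partition": 0, "replicas": [1, 2]},
        {"topic": "events", "partition": 1, "replicas": [2, 1]},
        {"topic": "events", "partition": 2, "replicas": [1, 2, 3]},
    ],
}

batches = list(readd_in_batches(sample, 3, 1))
for number, plan in enumerate(batches, start=1):
    print(f"batch {number}: {json.dumps(plan)}")
```

Each plan would be applied on its own, waiting for the previous reassignment to finish before starting the next, so the returning broker only has to catch up on a small number of partitions at once.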
The high level procedure is as follows:
Caveats about the scripts: