Kafka: what could be the root cause of "Consumer group is rebalancing"?

We have a Hadoop cluster that includes `datanode` machines and 5 `kafka` machines.

The Kafka machines were installed from the Hortonworks packages; the `kafka` version is 0.1X.

On the `datanode` machines we run the `deeg_data` application as executors that consume data from `kafka` topics.

So the `deeg_data` applications consume data from topic partitions that exist on the Kafka cluster (`deeg_data` uses the `kafka` client for consuming).

Over the last few days we saw that our `deeg_data` application was failing, and we started looking for the root cause.

On the `kafka` cluster we see the following behavior:

 

/usr/hdp/current/kafka-broker/bin/kafka-consumer-groups.sh --group deeg_data --describe --bootstrap-server kafka1:6667
To enable GC log rotation, use -Xloggc:<filename> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=<num_of_files>
where num_of_file > 0
GC log rotation is turned off
Consumer group ‘deeg_data’ is rebalancing


From the `kafka` side, the cluster is healthy: all topics are balanced, and all Kafka brokers are up and registered correctly in ZooKeeper.

After some time (a couple of hours), we ran the same command again, and this time it did not report `Consumer group ‘deeg_data’ is rebalancing`.

Instead, we got the following correct results:

/usr/hdp/current/kafka-broker/bin/kafka-consumer-groups.sh --group deeg_data --describe --bootstrap-server kafka1:6667
To enable GC log rotation, use -Xloggc:<filename> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=<num_of_files>
where num_of_file > 0
GC log rotation is turned off
GROUP      TOPIC             PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG      OWNER
deeg_data  pot.sdr.proccess  0          6397256247      6403318505      6062258  consumer-1_/10.3.6.237
deeg_data  pot.sdr.proccess  1          6397329465      6403390955      6061490  consumer-1_/10.3.6.237
deeg_data  pot.sdr.proccess  2          6397314633      6403375153      6060520  consumer-1_/10.3.6.237
deeg_data  pot.sdr.proccess  3          6397258695      6403320788      6062093  consumer-1_/10.3.6.237
deeg_data  pot.sdr.proccess  4          6397316230      6403378448      6062218  consumer-1_/10.3.6.237
deeg_data  pot.sdr.proccess  5          6397325820      6403388053      6062233  consumer-1_/10.3.6.237
...

 


So we want to understand why we get:

Consumer group ‘deeg_data’ is rebalancing


What is the reason for the above state, and why do we get `rebalancing`?

Michael-Bronson
1 REPLY

@mike_bronson7 

 

In Kafka 0.1x, this statement (Consumer group ‘deeg_data’ is rebalancing) is shown while the group is rebalancing. In newer versions, we see something like this instead:

 

GROUP      TOPIC      PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG  CONSUMER-ID  HOST  CLIENT-ID
GroupName  topicName  0          0               0               0    -            -     -

This means there are no active consumers in this group (or the group is rebalancing).
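
If you want to check this from a script, here is a minimal sketch that classifies the text printed by `kafka-consumer-groups.sh --describe`. It assumes the two output layouts shown in this thread (an `OWNER` column in the older tool, a `CONSUMER-ID` column in newer ones); the function name and the returned labels are made up for illustration, and other versions of the tool may print something different.

```python
def classify_describe_output(text):
    """Classify `kafka-consumer-groups.sh --describe` output (illustrative only)."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]

    # Older tools print an explicit message while the group is rebalancing.
    if any("is rebalancing" in ln for ln in lines):
        return "rebalancing"

    # Locate the table header; JVM/GC notices before it are ignored.
    header_idx = next(
        (i for i, ln in enumerate(lines) if ln.split()[0] == "GROUP"), None
    )
    if header_idx is None:
        return "unknown"

    header = lines[header_idx].split()
    # Keep only rows with the same number of columns as the header.
    rows = [ln.split() for ln in lines[header_idx + 1:]]
    rows = [r for r in rows if len(r) == len(header)]
    if not rows:
        return "unknown"

    # Assumption: the member column is CONSUMER-ID (newer) or OWNER (older).
    if "CONSUMER-ID" in header:
        member_col = header.index("CONSUMER-ID")
    elif "OWNER" in header:
        member_col = header.index("OWNER")
    else:
        return "unknown"

    # A "-" in every row means no member currently owns any partition.
    if all(r[member_col] == "-" for r in rows):
        return "no_active_members"
    return "active"
```

A result of `no_active_members` cannot distinguish an idle group from one that is in the middle of a rebalance, so it is only a hint to re-run the command a little later.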

 

A group rebalance can be triggered for multiple reasons, but mostly because of:

1. A new consumer joins the group

2. A consumer is removed from the group (because of a client shutdown, a timeout, or network glitches)

3. Timeout issues between the brokers and the client
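
For the timeout-related causes, it can help to sanity-check the consumer settings. The sketch below is only an illustration: `session.timeout.ms`, `heartbeat.interval.ms` and `max.poll.interval.ms` are real Kafka consumer properties, but `max.poll.interval.ms` only exists in 0.10.1 and later clients, so it may not apply to your version. The function name, the `observed_batch_ms` parameter (how long your application takes to process one `poll()` batch) and the warning texts are hypothetical.

```python
def check_consumer_timeouts(config, observed_batch_ms=None):
    """Return a list of warnings for timeout settings that tend to cause rebalances."""
    warnings = []
    session = config.get("session.timeout.ms")
    heartbeat = config.get("heartbeat.interval.ms")
    max_poll = config.get("max.poll.interval.ms")

    if session is not None and heartbeat is not None:
        if heartbeat >= session:
            # The Kafka docs require the heartbeat interval to be lower
            # than the session timeout.
            warnings.append(
                "heartbeat.interval.ms must be lower than session.timeout.ms"
            )
        elif heartbeat * 3 > session:
            # The Kafka docs suggest at most 1/3 of the session timeout.
            warnings.append(
                "heartbeat.interval.ms is above 1/3 of session.timeout.ms"
            )

    if max_poll is not None and observed_batch_ms is not None:
        if observed_batch_ms >= max_poll:
            # A consumer that does not call poll() again in time is treated
            # as failed, and the group rebalances.
            warnings.append(
                "batch processing time reaches max.poll.interval.ms"
            )

    return warnings
```

An empty list does not prove the settings are fine; it only means none of these specific checks fired. Take the actual values from your application's consumer configuration rather than from this example.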

 

To get more details about consumer rebalancing (if there are no errors on the broker side), check the application log files; they might provide some details about the underlying issue.
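
As a starting point for going through the application logs, here is a rough sketch that counts lines containing rebalance-related text. The default patterns are an assumption based on messages the Java consumer client commonly logs; the exact wording differs between client versions, so adjust the list to what your `deeg_data` logs actually contain. The function name is hypothetical.

```python
from collections import Counter

# Assumed substrings; exact wording depends on the Kafka client version.
DEFAULT_PATTERNS = (
    "Revoking previously assigned partitions",
    "(Re-)joining group",
    "Attempt to heartbeat failed",
    "CommitFailedException",
)


def count_rebalance_hints(lines, patterns=DEFAULT_PATTERNS):
    """Count how many log lines contain each rebalance-related substring."""
    counts = Counter()
    for line in lines:
        for pattern in patterns:
            if pattern in line:
                counts[pattern] += 1
    return counts
```

An open file object can be passed directly as `lines`. Many hits clustered around the time the application failed would suggest that consumers were repeatedly leaving and rejoining the group.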