We have Hadoop cluster that include `datanode` machines and `5 kafka` machines
Kafka machines are installed as part of hortonworks packages , `kafka` version is 0.1X
We run on `datanode` the `deeg_data` applications as executors that consuming data from `kafka` topics
So applications `deeg_data` are consuming data from topics partitions that exists on kafka cluster ( `deeg_data` used `kafka` client for consuming )
On last days we saw that our application – `deeg_data` are failed and we start to find the root cause
On `kafka` cluster we see the following behavior
/usr/hdp/current/kafka-broker/bin/kafka-consumer-groups.sh --group deeg_data --describe --bootstrap-server kafka1:6667
To enable GC log rotation, use -Xloggc:<filename> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=<num_of_files>
where num_of_file > 0
GC log rotation is turned off
Consumer group ‘deeg_data’ is rebalancing
from `kafka` side `kafka` cluster is healthy and all topics are balanced and all kafka brokers are up and signed correctly to zookeeper
After some time ( couple hours ) , we run again the following , but without the errors about - `Consumer group ‘deeg_data’ is rebalancing`
And we get the following correctly results
/usr/hdp/current/kafka-broker/bin/kafka-consumer-groups.sh --group deeg_data --describe --bootstrap-server kafka1:6667
To enable GC log rotation, use -Xloggc:<filename> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=<num_of_files>
where num_of_file > 0
GC log rotation is turned off
GROUP TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG OWNER
deeg_data pot.sdr.proccess 0 6397256247 6403318505 6062258 consumer-1_/10.3.6.237
deeg_data pot.sdr.proccess 1 6397329465 6403390955 6061490 consumer-1_/10.3.6.237
deeg_data pot.sdr.proccess 2 6397314633 6403375153 6060520 consumer-1_/10.3.6.237
deeg_data pot.sdr.proccess 3 6397258695 6403320788 6062093 consumer-1_/10.3.6.237
deeg_data pot.sdr.proccess 4 6397316230 6403378448 6062218 consumer-1_/10.3.6.237
deeg_data pot.sdr.proccess 5 6397325820 6403388053 6062233 consumer-1_/10.3.6.237.
.
.
.
So we want to understand why we get:
Consumer group ‘deeg_data’ is rebalancing
What is the reason for above state , and why we get `rebalancing`
Michael-Bronson