<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Kafka consumer group lag in one or two partition even without data ingestion into kafka in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Kafka-consumer-group-lag-in-one-or-two-partition-even/m-p/79096#M82627</link>
    <description>&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I observed strange behaviour ingested data around 50 records into sytem when we have data lag at Kafka channel topic. Strangely lag data is flushed but remaining partition reset current-offset to previous offset even though no data in those particular partitions. How it is possible reset consumer current-offset when we are using as flume sinks. please see offset reset to earlier below screenshot.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="when_initially_lag.PNG" style="width: 600px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/4802iEB6FBAD532315187/image-size/large?v=v2&amp;amp;px=999" role="button" title="when_initially_lag.PNG" alt="when_initially_lag.PNG" /&gt;&lt;/span&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="offset_reset_after_ingest_50records.PNG" style="width: 600px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/4803i01C6F5AF9E7C8DDD/image-size/large?v=v2&amp;amp;px=999" role="button" title="offset_reset_after_ingest_50records.PNG" alt="offset_reset_after_ingest_50records.PNG" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;When usually current offset is going to reset previous/earlier position even though without any restart of flume component (i.e.consumer).&lt;/P&gt;</description>
    <pubDate>Tue, 28 Aug 2018 04:13:55 GMT</pubDate>
    <dc:creator>monica.nicoara</dc:creator>
    <dc:date>2018-08-28T04:13:55Z</dc:date>
    <item>
      <title>Kafka consumer group lag in one or two partition even without data ingestion into kafka</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Kafka-consumer-group-lag-in-one-or-two-partition-even/m-p/78995#M82625</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have a strange problem at kafka channel topic like kafka consumer group lag( 15 lacs events) in one or two partition only.I'll give little background aboout problem:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Please find the data flow into system as shown below:&lt;/P&gt;&lt;P&gt;data ingestion ==&amp;gt; kafka&amp;nbsp; ABC(topic of 3 parition) ==&amp;gt; flume source (interceptor ) ==&amp;gt; Kafka DEF(topic of 6 partition ) ==&amp;gt; Hive (parquet file)&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;This is three node edge nodes ( kafka, flume) are running in all nodes. Meanwhile we have separate cluster where data lake is running ( HDFC,Hive, HBase, Cloudera manager,Phoenix client)&lt;/P&gt;&lt;P&gt;component versions:&lt;/P&gt;&lt;P&gt;Kafka ==&amp;gt; 0.10.1.1&lt;/P&gt;&lt;P&gt;Flume ==&amp;gt; 1.7&lt;/P&gt;&lt;P&gt;CDH ==&amp;gt; 5.14&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;We stopped all flume agent an hour or two hours and started data ingestion around 12lacs per min into kafka ABC topics. we stopped data ingestions into system and started all flume agents (in three nodes). Then data is comming into hive tables. after one or two mins. we observed either one or two or three paritions (assume 1,4 ,6) at DEF topic is showing as LAG is increasing constantly compare with flume source.But strangely kafka ABC (three partition) is working fine and lag is showing equally.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Here those three paritions lag is increased without cleared at DEF topics and remaining parition is showing lag is zero. Kafka ABC topic data is cleared and but not at DEF topics of 1,4 and 6 partitions even though data is not coming into system. It did not cleared in next a day or two days also. We restarted kafka and flume couple of times. But sometimes one or two paritions lag is going to cleared, but not all paritions.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Please find flume configurations:&lt;/P&gt;&lt;P&gt;------------------------------------------&lt;/P&gt;&lt;P&gt;agent.sources = \&lt;BR /&gt;flume-agent \&lt;/P&gt;&lt;P&gt;# Channels&lt;BR /&gt;agent.channels = \&lt;BR /&gt;kafkachannel \&lt;/P&gt;&lt;P&gt;# Sinks&lt;BR /&gt;agent.sinks = \&lt;BR /&gt;kite-sink-1 kite-sink-2 \&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;# Sources&lt;BR /&gt;agent.sources.flume-agent.channels = kafkachannel&lt;BR /&gt;agent.sources.flume-agent.type = org.apache.flume.source.kafka.KafkaSource&lt;BR /&gt;agent.sources.flume-agent.kafka.bootstrap.servers = example1.host.com:9092,example2.host.com:9092,example3.host.com:9092&lt;BR /&gt;agent.sources.flume-agent.kafka.num.consumer.fetchers = 10&lt;BR /&gt;agent.sources.flume-agent.kafka.topics = abc&lt;BR /&gt;agent.sources.flume-agent.interceptors = abc-interceptor abcinterceptor&lt;BR /&gt;agent.sources.flume-agent.interceptors.abc-parameters-interceptor.type = static&lt;BR /&gt;agent.sources.flume-agent.interceptors.abcinterceptor.type = com.abc.flume.interceptor.ABCinterceptor$Builder&lt;BR /&gt;agent.sources.flume-agent.interceptors.abc-parameters-interceptor.key = flume.avro.schema.url&lt;BR /&gt;agent.sources.flume-agent.interceptors.abc-parameters-interceptor.value = hdfs://myhadoop/abc.avsc&lt;BR /&gt;agent.sources.flume-agent.interceptors.abc-parameters-interceptor.threadNum = 10&lt;BR /&gt;agent.sources.flume-agent.kafka.consumer.security.protocol = PLAINTEXT&lt;BR /&gt;agent.sources.flume-agent.kafka.consumer.group.id = my-group&lt;/P&gt;&lt;P&gt;# Sinks&lt;BR /&gt;agent.sinks.kite-sink-1.channel = kafkachannel&lt;BR /&gt;agent.sinks.kite-sink-1.type = org.apache.flume.sink.kite.DatasetSink&lt;BR /&gt;agent.sinks.kite-sink-1.kite.repo.uri = repo:hive&lt;BR /&gt;agent.sinks.kite-sink-1.kite.dataset.name = abc&lt;BR /&gt;agent.sinks.kite-sink-1.kite.batchSize = 100000&lt;BR /&gt;agent.sinks.kite-sink-1.kite.rollInterval = 30&lt;BR /&gt;agent.sinks.kite-sink-2.channel = kafkachannel&lt;BR /&gt;agent.sinks.kite-sink-2.type = org.apache.flume.sink.kite.DatasetSink&lt;BR /&gt;agent.sinks.kite-sink-2.kite.repo.uri = repo:hive&lt;BR /&gt;agent.sinks.kite-sink-2.kite.dataset.name = abc&lt;BR /&gt;agent.sinks.kite-sink-2.kite.batchSize = 100000&lt;BR /&gt;agent.sinks.kite-sink-2.kite.rollInterval = 30&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;# Channels&lt;BR /&gt;agent.channels.kafkachannel.type = org.apache.flume.channel.kafka.KafkaChannel&lt;BR /&gt;agent.channels.kafkachannel.brokerList = example1.host.com:9092,example2.host.com:9092,example3.host.com:9092&lt;BR /&gt;agent.channels.kafkachannel.kafka.topic = def&lt;BR /&gt;agent.channels.kafkachannel.kafka.consumer.group.id = my-group-kite&lt;BR /&gt;agent.channels.kafkachannel.parseAsFlumeEvent = true&lt;BR /&gt;agent.channels.kafkachannel.kafka.consumer.session.timeout.ms =60000&lt;BR /&gt;agent.channels.kafkachannel.kafka.consumer.request.timeout.ms=70000&lt;BR /&gt;agent.channels.kafkachannel.kafka.consumer.max.poll.records=100000&lt;BR /&gt;agent.channels.kafkachannel.kafka.num.consumer.fetchers = 10&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;kafka server.properties modified than defaults:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;zookeeper.session.timeout.ms=9000&lt;/P&gt;&lt;P&gt;group.max.session.timeout.ms=60000&lt;/P&gt;&lt;P&gt;group.min.session.timeout.ms=6000&lt;/P&gt;&lt;P&gt;inter.broker.protocol.version=0.10.1.1&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Kindly help me. What was wrong in my configuration.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 16 Sep 2022 13:37:45 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Kafka-consumer-group-lag-in-one-or-two-partition-even/m-p/78995#M82625</guid>
      <dc:creator>Yarra</dc:creator>
      <dc:date>2022-09-16T13:37:45Z</dc:date>
    </item>
    <item>
      <title>Re: Kafka consumer group lag in one or two partition even without data ingestion into kafka</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Kafka-consumer-group-lag-in-one-or-two-partition-even/m-p/78996#M82626</link>
      <description>&lt;P&gt;I did not observe any error message in kafka or flume logs.But, i saw below messages in flume logs&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;2018-08-24 04:35:47 DEBUG Fetcher:677 - Discarding stale fetch response for partition def-3 since its offset 395690 does not match the expected offset 394118&lt;BR /&gt;2018-08-24 04:35:47 DEBUG Fetcher:677 - Discarding stale fetch response for partition def-0 since its offset 81880195 does not match the expected offset 81878380&lt;BR /&gt;2018-08-24 04:35:47 DEBUG Fetcher:671 - Ignoring fetched records for partition def-4 since it is no longer fetchable&lt;BR /&gt;&lt;BR /&gt;2018-08-24 04:35:47 DEBUG Fetcher:671 - Ignoring fetched records for partition def-5 since it is no longer fetchable&lt;BR /&gt;2018-08-24 04:35:47 DEBUG Fetcher:671 - Ignoring fetched records for partition def-1 since it is no longer fetchable&lt;/P&gt;&lt;P&gt;2018-08-24 04:35:47 DEBUG Fetcher:677 - Discarding stale fetch response for partition def-3 since its offset 395690 does not match the expected offset 394118&lt;BR /&gt;2018-08-24 04:35:47 DEBUG Fetcher:677 - Discarding stale fetch response for partition def-0 since its offset 81880195 does not match the expected offset 81878380&lt;BR /&gt;2018-08-24 06:11:03 DEBUG Fetcher:677 - Discarding stale fetch response for partition def-4 since its offset 482480 does not match the expected offset 482450&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;No error or exceptions apart from above messages.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Sometimes, total data is reached at hive tables.But,lag will be remain same; why flume sink consumed offset is not updating at group coordinator level?&lt;/P&gt;</description>
      <pubDate>Sat, 25 Aug 2018 03:23:19 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Kafka-consumer-group-lag-in-one-or-two-partition-even/m-p/78996#M82626</guid>
      <dc:creator>Yarra</dc:creator>
      <dc:date>2018-08-25T03:23:19Z</dc:date>
    </item>
    <item>
      <title>Re: Kafka consumer group lag in one or two partition even without data ingestion into kafka</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Kafka-consumer-group-lag-in-one-or-two-partition-even/m-p/79096#M82627</link>
      <description>&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I observed strange behaviour ingested data around 50 records into sytem when we have data lag at Kafka channel topic. Strangely lag data is flushed but remaining partition reset current-offset to previous offset even though no data in those particular partitions. How it is possible reset consumer current-offset when we are using as flume sinks. please see offset reset to earlier below screenshot.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="when_initially_lag.PNG" style="width: 600px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/4802iEB6FBAD532315187/image-size/large?v=v2&amp;amp;px=999" role="button" title="when_initially_lag.PNG" alt="when_initially_lag.PNG" /&gt;&lt;/span&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="offset_reset_after_ingest_50records.PNG" style="width: 600px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/4803i01C6F5AF9E7C8DDD/image-size/large?v=v2&amp;amp;px=999" role="button" title="offset_reset_after_ingest_50records.PNG" alt="offset_reset_after_ingest_50records.PNG" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;When usually current offset is going to reset previous/earlier position even though without any restart of flume component (i.e.consumer).&lt;/P&gt;</description>
      <pubDate>Tue, 28 Aug 2018 04:13:55 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Kafka-consumer-group-lag-in-one-or-two-partition-even/m-p/79096#M82627</guid>
      <dc:creator>monica.nicoara</dc:creator>
      <dc:date>2018-08-28T04:13:55Z</dc:date>
    </item>
    <item>
      <title>Re: Kafka consumer group lag in one or two partition even without data ingestion into kafka</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Kafka-consumer-group-lag-in-one-or-two-partition-even/m-p/79260#M82628</link>
      <description>&lt;P&gt;This is a bug in flume 1.7 version (&lt;A href="https://issues.apache.org/jira/browse/FLUME-3027" target="_blank"&gt;https://issues.apache.org/jira/browse/FLUME-3027&lt;/A&gt;).This issue is resolved with adding code offsets.clear() in below method after consumer.commitSync method otherwise it will create problem when consumer rebalancing time.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;void commitOffsets() {
      this.consumer.commitSync(offsets);
    }&lt;/PRE&gt;&lt;P&gt;Either we have to upgrade to 1.8 of flume or adding offsets.clear() code end of the method.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;&lt;P&gt;Srinivas&lt;/P&gt;</description>
      <pubDate>Fri, 31 Aug 2018 11:25:57 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Kafka-consumer-group-lag-in-one-or-two-partition-even/m-p/79260#M82628</guid>
      <dc:creator>Yarra</dc:creator>
      <dc:date>2018-08-31T11:25:57Z</dc:date>
    </item>
    <item>
      <title>Re: Kafka consumer group lag in one or two partition even without data ingestion into kafka</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Kafka-consumer-group-lag-in-one-or-two-partition-even/m-p/79272#M82629</link>
      <description>&lt;P&gt;FLUME-3027 has been backported to CDH5.11.0 and above, so if you are able to upgrade, it would prevent the issue of offsets bouncing back and forward.&lt;BR /&gt;&lt;BR /&gt;One thing you may want to consider, if you are getting rebalances, it may be because it is taking too long to deliver by your sink, before polling kafka again. You may want to consider lowering your sink batch size in order to deliver and ack the messages in a timely fashion.&lt;BR /&gt;&lt;BR /&gt;Additionally, if you upgrade to CDH5.14 or higher, the flume kafka client is 0.10.2, and you would be able to set max.poll.records to match the batchSize you are using for the flume sink. Additionally, you could increase the max.poll.interval.ms, which is decoupled from the session.timeout.ms in 0.10.0 and above. This would prevent the rebalancing from occurring since the client would still heartbeat without having to do a poll to pull more records before session.timeout.ms expires.&lt;BR /&gt;&lt;BR /&gt;-pd&lt;/P&gt;</description>
      <pubDate>Fri, 31 Aug 2018 16:02:33 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Kafka-consumer-group-lag-in-one-or-two-partition-even/m-p/79272#M82629</guid>
      <dc:creator>pdvorak</dc:creator>
      <dc:date>2018-08-31T16:02:33Z</dc:date>
    </item>
  </channel>
</rss>

