"ForceFromStart" option in Storm-Kafka 1.0

Contributor

Hi All,

We have upgraded our environment from HDP 2.3.4 to HDP 2.4 and are reviewing our Storm topologies in order to use version 1.0.1 and its new features. In the old version, the storm-kafka library (0.9.2) included the option "forceFromStart" in SpoutConfig, which makes the spout read data from the beginning of the Kafka topic (if set to true) or read only current values (if set to false).

In the latest version the option has been removed, and I see that the spout always reads data from the beginning.

How can we replace this functionality in the new version? I want the spout to read only current values; how can I do that?

Thanks in advance, Giuseppe

1 ACCEPTED SOLUTION

Master Mentor

Storm 1.0.1 is only available in HDP 2.5, so I'm wondering whether you have issues with dependencies. According to the latest documentation for 2.5, forceFromStart is still there and defaults to false. http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.5.0/bk_storm-component-guide/content/storm-kafka...

9 REPLIES

Contributor

Thank you Artem! Yes, I'm working with 2.5; I wrote the wrong version above...

The dependencies are OK, but KafkaConfig doesn't have a "forceFromStart" property. In fact, I've checked the KafkaConfig source and it's missing; please see

https://github.com/apache/storm/blob/master/external/storm-kafka/src/jvm/org/apache/storm/kafka/Kafk...

So it could be an error in the documentation. In that case, how can I use this property if it is not present in KafkaConfig?

Master Mentor

I will follow up with engineering on this, but for now, can you try the following:

kafka.api.OffsetRequest.LatestTime()

That should grab only the latest messages.

Master Mentor
@Giuseppe Maldarizzi

Looks like you're correct: forceFromStart was removed from Storm in favor of EarliestTime/LatestTime. It is a problem with the documentation; I will ask the docs team to remove that and replace it with the correct info. The associated JIRAs are https://issues.apache.org/jira/browse/STORM-563 and STORM-650.

Contributor

Thank you, Artem, for the clarification. I will proceed using the new approach.

Master Mentor

@Giuseppe Maldarizzi once you confirm it works, please accept the answer to close the thread.

Contributor

Yes, it works; we can close the thread. Thanks again.

Master Mentor

@Giuseppe Maldarizzi I just heard back from engineering: please also look at the ignoreZkOffsets parameter in place of forceFromStart. The documentation will be updated. https://github.com/apache/storm/tree/master/external/storm-kafka#how-kafkaspout-stores-offsets-of-a-...
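
To make the interaction between the two settings concrete, here is a rough sketch of how I understand the start position gets chosen, based on the storm-kafka README linked above. It is a simplified model in plain Java, not the library's code: the class StartOffsetSketch and the method chooseStart are hypothetical names made up for illustration. The only parts taken from the real libraries are the two setting names (startOffsetTime and ignoreZkOffsets on the spout configuration) and the sentinel values -2 and -1, which are what kafka.api.OffsetRequest.EarliestTime() and LatestTime() return. Exact behavior may differ between storm-kafka versions, so please check it against the version you deploy.

```java
import java.util.OptionalLong;

public class StartOffsetSketch {
    // Values returned by kafka.api.OffsetRequest.EarliestTime() and LatestTime().
    static final long EARLIEST_TIME = -2L;
    static final long LATEST_TIME = -1L;

    /**
     * Hypothetical helper (not part of storm-kafka): describes where a spout
     * would start reading, following the rule in the storm-kafka README.
     *
     * @param ignoreZkOffsets  models the ignoreZkOffsets setting
     * @param startOffsetTime  models the startOffsetTime setting
     * @param zkOffset         offset previously stored in ZooKeeper, if any
     */
    static String chooseStart(boolean ignoreZkOffsets, long startOffsetTime, OptionalLong zkOffset) {
        // A stored offset wins unless the spout is told to ignore it.
        if (!ignoreZkOffsets && zkOffset.isPresent()) {
            return "resume at stored offset " + zkOffset.getAsLong();
        }
        // Otherwise startOffsetTime decides.
        if (startOffsetTime == EARLIEST_TIME) {
            return "start at beginning of topic";
        }
        if (startOffsetTime == LATEST_TIME) {
            return "start at end of topic (new messages only)";
        }
        return "start at timestamp " + startOffsetTime;
    }

    public static void main(String[] args) {
        // Topology has run before, offsets are honored: LatestTime has no effect.
        System.out.println(chooseStart(false, LATEST_TIME, OptionalLong.of(42)));
        // Same state, but stored offsets are ignored: only new messages are read.
        System.out.println(chooseStart(true, LATEST_TIME, OptionalLong.of(42)));
        // First run with EarliestTime: whole topic is read.
        System.out.println(chooseStart(false, EARLIEST_TIME, OptionalLong.empty()));
    }
}
```

In other words, to get the old forceFromStart=false behavior of reading only current values, even on a topology that has already stored offsets, the README suggests combining startOffsetTime = kafka.api.OffsetRequest.LatestTime() with ignoreZkOffsets = true.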

Contributor

Perfect, I will try it. Thank you again.