Created 01-13-2017 11:58 PM
Storm Version: 0.10.0.2.4
Using a Kafka Spout.
How does Storm handle failed tuples?
How many times will Storm retry a failed tuple?
At what frequency will Storm retry the failed tuple?
What is the max tuple count a topology can handle between all spouts and bolts?
Created 01-14-2017 06:59 PM
Hi @Jon Maestas
Answering your questions inline:
How does storm handle failed tuples?
Storm handles tuple failures by retrying only when you use at-least-once processing, which requires anchoring and acking. A retry means the spout re-emits the tuple.
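To make the re-emit idea concrete, here is a toy model in plain Python. It is not Storm's API: the class `ToySpout` and its method names are made up for illustration, and a real spout would also deal with timeouts and offsets. It only shows the bookkeeping: a tuple stays pending until it is acked, and a fail puts it back in line to be emitted again.

```python
class ToySpout:
    """Toy model of a reliable spout (hypothetical, not Storm's API)."""

    def __init__(self, messages):
        # Each message gets an id so acks and fails can refer back to it.
        self.queue = list(enumerate(messages))
        self.pending = {}   # emitted but not yet acked
        self.emitted = []   # ids in the order they were emitted

    def next_tuple(self):
        if not self.queue:
            return None
        msg_id, payload = self.queue.pop(0)
        self.pending[msg_id] = payload
        self.emitted.append(msg_id)
        return msg_id, payload

    def ack(self, msg_id):
        # Fully processed downstream: forget the tuple.
        self.pending.pop(msg_id)

    def fail(self, msg_id):
        # Processing failed: queue the same tuple for re-emission.
        self.queue.append((msg_id, self.pending.pop(msg_id)))


spout = ToySpout(["a", "b"])
first = spout.next_tuple()
spout.fail(first[0])          # "a" fails and is queued again
second = spout.next_tuple()
spout.ack(second[0])          # "b" succeeds
replayed = spout.next_tuple()
spout.ack(replayed[0])        # "a" succeeds on the retry
print(spout.emitted)          # [0, 1, 0]
```

Message 0 shows up twice in the emit order, which is exactly what "at least once" means: downstream bolts may see the same tuple more than once.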
How many times will storm retry a failed tuple?
This depends on the spout's logic. The Kafka spout in Storm 0.10.x supports exponential backoff retry (https://github.com/apache/storm/blob/0.10.x-branch/external/storm-kafka/src/jvm/storm/kafka/ExponentialBackoffMsgRetryManager.java).
What frequency will storm retry the failed tuple?
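The exponential backoff retry manager determines the frequency: the delay before each successive retry of a tuple grows until it reaches a configured maximum.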
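As a rough sketch of what an exponential backoff schedule looks like, the snippet below computes retry delays from an initial delay, a multiplier, and a maximum delay. The function name and the numbers are made up for illustration, and the exact formula in the linked class may differ, so check the source before relying on it.

```python
def backoff_delays_ms(initial_delay_ms, multiplier, max_delay_ms, retries):
    """Return the delay before each retry, capped at max_delay_ms.

    Hypothetical helper for illustration; not taken from storm-kafka.
    """
    delays = []
    for attempt in range(retries):
        delay = initial_delay_ms * (multiplier ** attempt)
        delays.append(min(int(delay), max_delay_ms))
    return delays


# Illustrative values only: start at 100 ms, double each time, cap at 1 s.
print(backoff_delays_ms(100, 2.0, 1000, 6))
# [100, 200, 400, 800, 1000, 1000]
```

With a multiplier of 1.0 every retry waits the same initial delay, so the multiplier is what makes the schedule back off.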
What is the max tuple count a topology can handle between all spouts and bolts?
I am guessing you are asking for the maximum number of tuples that can be in Storm's buffers at any given point. That is:
(Bolt Count * Executor Count * TOPOLOGY_EXECUTOR_RECEIVE_BUFFER_SIZE) + (Bolt Count * Executor Count * TOPOLOGY_EXECUTOR_SEND_BUFFER_SIZE)
You can find the values of these buffers in Ambari -> Storm -> Config by searching for "buffer".
Please note that the above is a theoretical maximum; Max Spout Pending (topology.max.spout.pending) throttles the number of in-flight tuples from the spout.
There are also transfer buffers, which add a bit more to the number calculated above.
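Here is the buffer estimate above worked through with numbers. The function name and the input values are assumptions picked for illustration, not read from any cluster; plug in your own bolt count, executors per bolt, and the buffer sizes shown in Ambari. Transfer buffers are left out, as in the formula.

```python
def max_buffered_tuples(bolt_count, executors_per_bolt,
                        receive_buffer_size, send_buffer_size):
    """Theoretical upper bound on tuples held in executor buffers.

    Hypothetical helper mirroring the formula in this answer.
    """
    executors = bolt_count * executors_per_bolt
    return executors * receive_buffer_size + executors * send_buffer_size


# Illustrative values: 3 bolts, 4 executors each, both buffers sized 1024.
print(max_buffered_tuples(3, 4, 1024, 1024))
# 24576
```

In practice the number in flight is usually far lower than this, because topology.max.spout.pending caps how many un-acked tuples each spout task may have outstanding.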
Please refer to Michael Noll's blog for more details about Storm Buffers (http://www.michael-noll.com/blog/2013/06/21/understanding-storm-internal-message-buffers/)
Hope this answers your questions.
Created 01-16-2017 11:59 PM
@Jon Maestas Please accept the answer if this answered your questions.
Created on 01-23-2017 01:36 AM - edited 08-19-2019 02:47 AM
Good write-up from @Ambud Sharma. You can also visit http://storm.apache.org/releases/1.0.2/Guaranteeing-message-processing.html for info from the source. Additionally, take a peek at the picture below, which I just exported from our http://hortonworks.com/training/class/hdp-developer-storm-and-trident-fundamentals/ course; it might help visualize all of this information.
Good luck and happy Storming!