Support Questions

Find answers, ask questions, and share your expertise
Announcements
Celebrating as our community reaches 100,000 members! Thank you!

Read from sharded rabbitmq and write to partitioned kafka with only once semantics.

avatar

Problem : Read from sharded rabbitmq and write to kafka with only once semantics.

==> I am planning to use Trident for the same.

  • I have used RabbitMq spout (storm) which has implemented BaseRichSpout (https://github.com/ppat/storm-rabbitmq)
    1. I can see that it is a non transactional spout, and hence "would not" support only once semantics of Trident. Am I correct?
  • If I need to build an OpaqueTransactionalSpout for Rabbitmq, Can you give me some hints on how to start or some examples to look into
  • I wanted to write unit test to check if my Trident Topology supports "Exactly once semantic". Are there any sample Unit tests/examples. The reason I want to write the tests because "Exactly once semantics" depend on Spout also not just on Trident or Storm with my understanding.
  • I have to support sharded Rabbitmq and hence how to connect to more than one queue. I am thinking of using Stream.merge(List<Streams>) after creating once stream for each queue, Is this good idea to do?. Let me know your thoughts on the same.
  • https://nathanmarz.github.io/storm/doc/storm/tride...
  • @Ram Sriharsha

    1 ACCEPTED SOLUTION

    avatar
    Explorer
    • Storm does not support the storm-rabbitmq spout that you mentioning, it does not ship with apache storm distro so I have not looked at the code.
    • Here is a JMS spout that you may be able to use as is with RabitMQ if RAbitMQ supports JMS https://github.com/hortonworks/storm/tree/2.3-mai... (Note that this is available in HWX repo but not under apache)
    • For an example of OpaqueSpout take a look at kafka-spout. https://github.com/apache/storm/blob/master/extern...
    • The unit test part for exactly once is kind of hard in my opinion and there are no current examples.
    • I am not sure if I understand the sharing queue's question completely. Your spout could read from N queues while outputting all the messages to a single stream, that is completely up to your spout. The merge can be useful if you want to create one spout -> one queue mapping and then merge all the output streams into a single stream that other processing units (trident states) subscribe to.

    View solution in original post

    2 REPLIES 2

    avatar
    Explorer
    • Storm does not support the storm-rabbitmq spout that you mentioning, it does not ship with apache storm distro so I have not looked at the code.
    • Here is a JMS spout that you may be able to use as is with RabitMQ if RAbitMQ supports JMS https://github.com/hortonworks/storm/tree/2.3-mai... (Note that this is available in HWX repo but not under apache)
    • For an example of OpaqueSpout take a look at kafka-spout. https://github.com/apache/storm/blob/master/extern...
    • The unit test part for exactly once is kind of hard in my opinion and there are no current examples.
    • I am not sure if I understand the sharing queue's question completely. Your spout could read from N queues while outputting all the messages to a single stream, that is completely up to your spout. The merge can be useful if you want to create one spout -> one queue mapping and then merge all the output streams into a single stream that other processing units (trident states) subscribe to.

    avatar
    Explorer
    • Out of the box Rabbitmq can support a single delivery of a message (Exactly Once), it really depends on how the RabbitMQ routing topology has been created. For instance, if you have a single exchange and (direct) queue then it's impossible for another queue to receive the message.
    • Instead merging streams, you may want to use RabbitMQ to do this, of course it depends on your RabbitMQ topology once again.