Created 04-13-2016 09:37 PM
How would I go about configuring multiple Flume agents to fetch data from an MQ message broker so that they don't write duplicate messages to their sink?
Created 04-15-2016 11:40 AM
Well, based on what we know so far, I'd say two Flume agents with the file or JDBC channel should work for you.
There will be no overlap in data because that is controlled by MQ itself, so it is not a matter for Flume.
On the Flume processing side, we ensure that no data loss happens by using the file or JDBC channel.
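As a rough sketch of what one such agent could look like, here is a minimal configuration using Flume's JMS source with a file channel. The agent name `a1`, the component names, the queue name, the JNDI settings, all paths, and the HDFS sink are assumptions for illustration only; nothing in this thread specifies them. The JNDI context factory shown is the file-based one often used with a WebSphere MQ `.bindings` file, so adjust it to your own setup.

```properties
# Hypothetical agent "a1": JMS source -> file channel -> HDFS sink.
# Every name, path, and queue below is a placeholder.
a1.sources = mq
a1.channels = fc
a1.sinks = out

# JMS source reading from a queue, so each message is consumed by one agent only
a1.sources.mq.type = jms
a1.sources.mq.channels = fc
a1.sources.mq.initialContextFactory = com.sun.jndi.fscontext.RefFSContextFactory
a1.sources.mq.providerURL = file:///opt/mq/jndi
a1.sources.mq.connectionFactory = ConnectionFactory
a1.sources.mq.destinationName = INGEST.QUEUE
a1.sources.mq.destinationType = QUEUE

# File channel keeps events on disk so they survive an agent restart
a1.channels.fc.type = file
a1.channels.fc.checkpointDir = /var/flume/checkpoint
a1.channels.fc.dataDirs = /var/flume/data

# The sink type is an assumption; any sink can be attached to the channel
a1.sinks.out.type = hdfs
a1.sinks.out.channel = fc
a1.sinks.out.hdfs.path = /flume/mq
```

A second agent would use the same source settings against the same queue, with its own checkpoint and data directories.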
Created 04-14-2016 08:19 AM
Can you explain the issue with MQ a bit? I'm not an expert in WebSphere, but it seems MQ is supposed to deliver each event only once, so there should be no duplicates by design. Is that correct?
Created 04-14-2016 03:52 PM
Hi @Michael M - good question. I think my understanding of the MQ queue was incorrect: I thought that after data is read it still exists on the queue, when in fact it is gone. The Flume agent is set to use the memory channel and not the file channel, so if the agent crashes, whatever has been ingested from the source is lost. This may be the wrong approach, because once the agent reads a message off the queue, that message is no longer available for consumption. So if the agent crashes (and is using the memory channel), that data is lost, right? And multiple Flume agents reading from the same queue won't step on each other because of this, right?
Created 04-15-2016 12:39 PM
Basically, the JMS standard never delivers an acknowledged message twice. So yes, each message goes to exactly one Flume agent. There is no replication, so you need to make sure that agent doesn't have outages (RAIDed disks, file channel, ...).
MQ systems offer different ways to provide reliability, for example publish/subscribe. But I don't think Flume supports that.
http://www.ibm.com/support/knowledgecenter/#!/SSFKSJ_7.0.1/com.ibm.mq.amqnar.doc/ps20010_.htm
There is also the possibility of duplicating each message to two topics. However, in this case you need to deduplicate somewhere in your ingest logic. (Flume would not work here; you would need to do that downstream when processing the messages.)
Created 04-19-2017 01:44 PM
Ryan, can I ask how you set up Flume to fetch messages from MQ?