08-24-2018 07:59 AM - last edited on 08-30-2018 06:29 AM by cjervis
Hi, is it possible for Flume to process multiple Kafka topics with 1 agent, 1 source, 1 channel and 1 sink?
We have about 20 Kafka topics to copy into HDFS using Flume.
I'm wondering whether I should create a Flume agent for each topic, use one agent with 20 sources/20 channels/20 sinks, or use one agent with 1 source/channel/sink, if that is possible.
Appreciate the feedback.
08-29-2018 08:16 PM
Since CDH 5.8, the Flume KafkaSource is able to consume from multiple topics:
Use the kafka.topics property, a comma-separated list of topics the Kafka consumer will read messages from.
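As a rough sketch of what that could look like with 1 agent, 1 source, 1 channel and 1 sink: the agent name (flume1), the component names, the broker address, the consumer group, the topic names and the HDFS path below are all placeholders I made up for illustration, so adjust them to your environment. The channel sizing values are arbitrary too.

```properties
# Hypothetical agent "flume1" with one source, one channel, one sink
flume1.sources = source1
flume1.channels = channel1
flume1.sinks = sink1

# Kafka source reading several topics through a single consumer
flume1.sources.source1.type = org.apache.flume.source.kafka.KafkaSource
# Placeholder broker address
flume1.sources.source1.kafka.bootstrap.servers = broker1:9092
# Comma-separated list of topics (placeholder names)
flume1.sources.source1.kafka.topics = table1,table2,table3
# Placeholder consumer group
flume1.sources.source1.kafka.consumer.group.id = flume-hdfs
flume1.sources.source1.channels = channel1

# Memory channel; capacities are illustrative, not tuned
flume1.channels.channel1.type = memory
flume1.channels.channel1.capacity = 10000
flume1.channels.channel1.transactionCapacity = 1000

# HDFS sink writing all topics under one placeholder path
flume1.sinks.sink1.type = hdfs
flume1.sinks.sink1.channel = channel1
flume1.sinks.sink1.hdfs.path = /tmp/kafka
flume1.sinks.sink1.hdfs.fileType = DataStream
```

With this layout every topic lands in the same channel and the same sink, so whether it is enough for 20 topics depends on your volume.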
08-30-2018 10:21 AM - edited 08-30-2018 10:23 AM
I have a few more questions on the scope of Flume:
1. Is it possible to have multiple Flume agents running on multiple hosts? For example, if we have 12 topics, can we split them across three hosts, each running a Flume agent?
2. With multiple topics on the same source, would that mean one channel/sink as well, or multiple channels/sinks?
Also, we create directories in HDFS based on the name of the topic. For the parameters below, is there a runtime variable we can use to specify the topic name? The topic name used below is 'table1'.
flume2.sinks.sink1.hdfs.path = /tmp/table1/
flume2.sinks.sink1.hdfs.filePrefix = table1-
Appreciate the insights.
08-30-2018 01:22 PM