04-19-2017 08:04 PM
As the topic says, I want to use the spoolDir source to monitor a folder on a remote machine.
These are the approaches I have looked at:
1. Use a spoolDir source and a Kafka sink on the client side, then consume that topic on the master side and write to HDFS.
2. Use a spoolDir source and an Avro sink on the client side, then receive the events on the master side with an Avro source and a channel selector.
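To make approach 2 concrete, the client side would look roughly like the sketch below. The agent name client1, the component names, the directory path, the host name master-host, and port 4545 are all placeholders I am assuming for illustration, not my real values.

```properties
# Client-side agent (hypothetical name "client1"): spooling directory -> Avro sink.
client1.sources = spool_source1
client1.channels = mem_ch1
client1.sinks = avro_sink1

# Watch a local directory for completed files (placeholder path).
client1.sources.spool_source1.type = spooldir
client1.sources.spool_source1.spoolDir = /path/to/spool1
client1.sources.spool_source1.channels = mem_ch1

# Memory channel chosen only to keep the sketch short.
client1.channels.mem_ch1.type = memory

# Forward events to the Avro source on the master (placeholder host and port).
client1.sinks.avro_sink1.type = avro
client1.sinks.avro_sink1.hostname = master-host
client1.sinks.avro_sink1.port = 4545
client1.sinks.avro_sink1.channel = mem_ch1
```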
I have tried approach 1 and it works fine. However, approach 2 seems more direct and faster to me. My goal is to do multiplexing header mappings on the master side, like this:
agent1.sources.avro_source1.selector.header=selector
agent1.sources.avro_source1.selector.mapping.spool1=ch1
agent1.sources.avro_source1.selector.mapping.spool2=ch2
agent1.sources.avro_source1.selector.mapping.spool3=ch3
However, unlike the taildir source, which has the headers.<filegroupName>.<headerKey> setting, spoolDir does not seem to have such a direct setting.
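For reference, this is the kind of taildir setting I mean. It is only a sketch: the agent name client1, the source name tail_source1, the file group name spool1, and the path pattern are made up for illustration. The header key "selector" is chosen to match the selector.header used on the master side.

```properties
# Taildir source: each file group can carry its own static header.
client1.sources.tail_source1.type = TAILDIR
client1.sources.tail_source1.filegroups = spool1
# Placeholder path pattern for the files in this group.
client1.sources.tail_source1.filegroups.spool1 = /path/to/spool1/.*log
# headers.<filegroupName>.<headerKey> = <value>
client1.sources.tail_source1.headers.spool1.selector = spool1
```

With this, every event from the spool1 file group carries selector=spool1, which is the value the multiplexing mapping on the master would route on.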
Am I missing a spoolDir configuration parameter that would achieve this? Or are there any alternatives I could try?