Explorer
Posts: 10
Registered: ‎08-10-2016
Accepted Solution

[Flume] Remote spoolDir source monitoring

As the topic says, I want to use a spoolDir source to monitor a folder on a remote machine.

 

There are two approaches I have considered:

1. Use a spoolDir source and a Kafka sink on the client side, then consume that topic on the master side and write to HDFS.

2. Use a spoolDir source and an Avro sink on the client side, then receive the events on the master side with an Avro source and a channel selector.

 

I have tried approach 1 and it works fine. However, approach 2 seems more direct and faster to me. My goal is to multiplex on a header on the master side, like this:

 

agent1.sources.avro_source1.selector.type=multiplexing
agent1.sources.avro_source1.selector.header=selector
agent1.sources.avro_source1.selector.mapping.spool1=ch1
agent1.sources.avro_source1.selector.mapping.spool2=ch2
agent1.sources.avro_source1.selector.mapping.spool3=ch3
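
For context, a minimal sketch of how the rest of the master-side Avro source might be declared around that selector. The bind address, port, and the choice of ch1 as the default channel are assumptions for illustration, not values from my actual setup:

```properties
# Master-side agent: Avro source feeding three channels.
# Bind address, port, and default channel are hypothetical.
agent1.sources = avro_source1
agent1.channels = ch1 ch2 ch3

agent1.sources.avro_source1.type = avro
agent1.sources.avro_source1.bind = 0.0.0.0
agent1.sources.avro_source1.port = 4545
agent1.sources.avro_source1.channels = ch1 ch2 ch3

# Events whose "selector" header matches no mapping go to the default channel.
agent1.sources.avro_source1.selector.default = ch1
```

The multiplexing selector routes each event by the value of the header named in selector.header, so every client has to attach that header before the event reaches the Avro source.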

However, unlike the Taildir source, which has the headers.<filegroupName>.<headerKey> setting, the spoolDir source does not seem to have such a direct setting.
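
For comparison, this is roughly what the Taildir setting looks like. The source name, file group name f1, and file path are made up for illustration; the header key and value are chosen to match the selector mapping above:

```properties
# Taildir source attaching a fixed header to every event from file group f1.
# Source name, group name, and path are hypothetical.
agent.sources.taildir_source1.type = TAILDIR
agent.sources.taildir_source1.filegroups = f1
agent.sources.taildir_source1.filegroups.f1 = /var/log/app/app.log
agent.sources.taildir_source1.headers.f1.selector = spool1
```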

 

Am I missing a spoolDir configuration parameter that would achieve this? Or are there any alternatives I could try?

 

Thanks everyone

 

 

Explorer
Posts: 10
Registered: ‎08-10-2016

Re: [Flume] Remote spoolDir source monitoring

Hi all,

I fixed the problem by adding a static interceptor to the spoolDir source:

agent.sources.spooldir_source1.interceptors=spool_interceptors
agent.sources.spooldir_source1.interceptors.spool_interceptors.type=static
agent.sources.spooldir_source1.interceptors.spool_interceptors.key=channelselector
agent.sources.spooldir_source1.interceptors.spool_interceptors.value=spool
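
For anyone landing here later, a fuller client-side sketch of approach 2 might look like the following. The channel and sink names, the spool directory, and the master host and port are assumptions for illustration. The interceptor key has to equal the selector.header value configured on the master, and the interceptor value has to equal one of its mapping names, so this sketch uses selector and spool1 to line up with the selector config in the first post rather than the channelselector and spool values shown above:

```properties
# Client-side agent: spoolDir source -> memory channel -> Avro sink.
# Names, paths, host, and port are hypothetical.
agent.sources = spooldir_source1
agent.channels = spool_ch
agent.sinks = avro_sink1

agent.sources.spooldir_source1.type = spooldir
agent.sources.spooldir_source1.spoolDir = /var/log/app/spool
agent.sources.spooldir_source1.channels = spool_ch

# Static interceptor: stamps every event with header selector=spool1.
agent.sources.spooldir_source1.interceptors = spool_interceptors
agent.sources.spooldir_source1.interceptors.spool_interceptors.type = static
agent.sources.spooldir_source1.interceptors.spool_interceptors.key = selector
agent.sources.spooldir_source1.interceptors.spool_interceptors.value = spool1

agent.channels.spool_ch.type = memory

# Avro sink forwards events to the Avro source on the master.
agent.sinks.avro_sink1.type = avro
agent.sinks.avro_sink1.hostname = master.example.com
agent.sinks.avro_sink1.port = 4545
agent.sinks.avro_sink1.channel = spool_ch
```

Each additional spool directory would get its own source with a different interceptor value (spool2, spool3, and so on), so the master can route it to a different channel.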
