flume - filtering data


We have a setup where Flume reads from Kafka and writes to HDFS.

We have a requirement for Flume to read data from Kafka for one particular day only, so we want to filter the data by timestamp.

For example, we want to read only the data for February 17th, 2019.

Each record contains the date in the format "2019-02-17".

I tried the following in my Flume configuration, but it did not work:

flume_agent.sources.kafka1.interceptors = regex
flume_agent.sources.kafka1.interceptors.regex.type = regex_filter
flume_agent.sources.kafka1.interceptors.regex.regex = 2019-02-17
flume_flat_agent.sources.kafka1.interceptors.regex.includeEvents = true

Appreciate any insights as to how I can achieve this.
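
Two details in the configuration above may be worth checking. The last property uses the prefix flume_flat_agent while the other three use flume_agent, so it would not apply to the same agent. In addition, the Flume User Guide lists excludeEvents (default false) as the switch for the regex_filter interceptor, not includeEvents. The following is a minimal sketch, not a verified fix: it assumes the agent is named flume_agent, the source is kafka1, and the event body is plain text that contains the date string.

```properties
# Sketch only: assumes agent "flume_agent" and source "kafka1" from the post above.
flume_agent.sources.kafka1.interceptors = regex
flume_agent.sources.kafka1.interceptors.regex.type = regex_filter

# Matched against the event body as text; assumes the date appears there literally.
flume_agent.sources.kafka1.interceptors.regex.regex = 2019-02-17

# false (the documented default) keeps matching events and drops the rest.
flume_agent.sources.kafka1.interceptors.regex.excludeEvents = false
```

Note that the interceptor only drops non-matching events after the source has received them, so Flume still consumes every message from the topic. If the records are serialized in a binary format such as Avro, a text regex on the body may not match, and the date would need to be exposed some other way.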