flume - filtering data

We have a setup where Flume reads from Kafka and writes into HDFS.

We have a requirement to have Flume read data from Kafka for one particular day only, so we want to filter the data by timestamp.

For example, we want to read only the data for February 17th, 2019.

Each record contains the date in the format "2019-02-17".

I tried the following in my Flume configuration, but it did not work:

flume_agent.sources.kafka1.interceptors = regex
flume_agent.sources.kafka1.interceptors.regex.type = regex_filter
flume_agent.sources.kafka1.interceptors.regex.regex = 2019-02-17
flume_flat_agent.sources.kafka1.interceptors.regex.includeEvents = true


Appreciate any insights as to how I can achieve this.
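
For reference, here is a minimal sketch of how the regex filtering interceptor is usually declared, reusing the agent name `flume_agent` and the source name `kafka1` from the configuration above. Two things in that configuration look worth checking: the last line uses the prefix `flume_flat_agent` rather than `flume_agent`, and the property the Flume User Guide documents for this interceptor is `excludeEvents` (default `false`), not `includeEvents`. The interceptor name `dayfilter` is a hypothetical label, and the sketch assumes the event body is text that contains the date string.

```properties
# Sketch only: assumes the agent is started with --name flume_agent
# and that the Kafka source is named kafka1, as in the question.
# "dayfilter" is an arbitrary, hypothetical interceptor name.
flume_agent.sources.kafka1.interceptors = dayfilter
flume_agent.sources.kafka1.interceptors.dayfilter.type = regex_filter
# The pattern is searched for anywhere in the event body (treated as text).
flume_agent.sources.kafka1.interceptors.dayfilter.regex = 2019-02-17
# false (the default) keeps events that match and drops the rest.
flume_agent.sources.kafka1.interceptors.dayfilter.excludeEvents = false
```

Note that an interceptor only drops events after the source has consumed them, so Flume would still read every message from the topic. It simply would not forward the non-matching ones to the channel and on to HDFS.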