Data processing with multiple Kafka topics -> save to hdfs

New Contributor

Hi

In my environment, applications write logs to Kafka. I want to read the logs from all Kafka topics and put them into HDFS, but I want to do this in a single flow (one consumer and one PutHDFS processor).

Is it possible to read from multiple Kafka topics in one consumer processor and have a single PutHDFS processor save the data to different directories according to the topic name?

Each application has its own Kafka topic, and the topic name follows the pattern AppName_app (for example, sap_app is the topic for the SAP application).

In the Kafka consumer processor description I see:

The name of the Kafka Topic(s) to pull from. More than one can be supplied if comma separated.
Supports Expression Language: true

Does this mean that I can use a topic name such as "^.*_app"?

 

In the PutHDFS processor I want to use a directory like /{$topic_name} to save the files in HDFS. Is it possible to do this in one processor for data from multiple topics?

1 REPLY

Super Guru

@lukolas The example you provide seems to be a regular expression (regex), not Expression Language. You would need to test some kind of Expression Language in the Topic Name(s) property.
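
To illustrate the difference, here is a small Python sketch (plain Python, not NiFi configuration) that treats "^.*_app" as a regular expression, filters a list of topic names with it, and derives one HDFS directory per matching topic. The topic list and the helper names matching_topics and hdfs_directory are assumptions made up for this illustration; only sap_app and the pattern come from the question.

```python
import re

# Pattern taken from the question. This is regular-expression syntax,
# not NiFi Expression Language.
TOPIC_PATTERN = re.compile(r"^.*_app")


def matching_topics(topics):
    """Return the topic names that match the whole pattern, in input order."""
    # fullmatch requires the name to END with "_app", not just contain it.
    return [t for t in topics if TOPIC_PATTERN.fullmatch(t)]


def hdfs_directory(topic):
    """Hypothetical mapping: one HDFS directory per topic name."""
    return "/" + topic


if __name__ == "__main__":
    # Hypothetical topic names; only sap_app appears in the question.
    topics = ["sap_app", "crm_app", "metrics", "app_sap"]
    for topic in matching_topics(topics):
        print(topic, "->", hdfs_directory(topic))
```

Whether the consumer processor accepts a pattern like this depends on the processor version and its properties, so check the processor documentation before relying on it.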

 

Another suggestion would be to have a file, or for example a GenerateFlowFile processor, that contains a list of topics. Then split/extract that list into attributes, and pass that attribute to the topic name property.
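
The list-driven idea can be sketched outside NiFi as follows. This is only an illustration in Python, assuming the file holds one topic name per line; the function names parse_topic_list and to_topic_property are hypothetical. It cleans the list and builds the comma-separated value that the processor description says the topic property accepts.

```python
def parse_topic_list(text):
    """Split file content into topic names: one per line,
    dropping blank lines and duplicates while keeping the original order."""
    seen = set()
    topics = []
    for line in text.splitlines():
        name = line.strip()
        if name and name not in seen:
            seen.add(name)
            topics.append(name)
    return topics


def to_topic_property(topics):
    """Join topic names into a single comma-separated string."""
    return ",".join(topics)


if __name__ == "__main__":
    # Hypothetical file content; crm_app is an invented example topic.
    content = "sap_app\n\ncrm_app\nsap_app\n"
    topics = parse_topic_list(content)
    print(to_topic_property(topics))
```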

 

Sending a large number of topics through a single processor can become a bottleneck, and it creates a large number of downstream flowfiles from each topic. So be careful with tuning and concurrency relative to the number of topics and the messages per topic.

 

If this answer resolves your issue or allows you to move forward, please choose to ACCEPT this solution and close this topic. If you have further dialogue on this topic please comment here or feel free to private message me. If you have new questions related to your Use Case please create separate topic and feel free to tag me in your post.  

 

Thanks,


Steven @ DFHZ