Replacing Flume workflow with NiFi help

Contributor

Hi there,

I am trying to replace our current Flume workflow with NiFi.

For a bit of background, I am working with JSON files where each line is an event (bid, login, purchase, comment, etc.) along with other information about that event.

What I am trying to do is:

1. Read the JSON file line by line

2. Grab the event type from each line

3. Store events of the same type in files of, say, 1 GB in HDFS, in a directory based on the event type, e.g. events/${event_type}. Preferably, I don't want to store lots of tiny files.
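The three steps above can be sketched outside NiFi to make the intended grouping concrete. This is only an illustration in plain Python, not a NiFi flow; the JSON key `event_type`, the sample events, and the helper name `group_lines_by_event_type` are assumptions, since the post does not show the actual file layout.

```python
import json
from collections import defaultdict


def group_lines_by_event_type(lines, key="event_type"):
    """Group raw JSON lines by the value of `key` (a hypothetical field name)."""
    groups = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        event = json.loads(line)
        # Events without the key are collected under "unknown".
        groups[event.get(key, "unknown")].append(line)
    return dict(groups)


# Made-up sample input: one JSON event per line.
sample = [
    '{"event_type": "bid", "amount": 10}',
    '{"event_type": "login", "user": "a"}',
    '{"event_type": "bid", "amount": 12}',
]

grouped = group_lines_by_event_type(sample)
for event_type, events in grouped.items():
    # Each group would end up under its own directory, e.g. events/bid.
    print(f"events/{event_type}: {len(events)} line(s)")
```

In a real flow the grouping and the 1 GB size threshold would be handled by the NiFi processors rather than by code like this.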

I have attached some of the different ways I have tried, but haven't been able to get the desired flow.

Thanks a lot!

Brendan

38480-merge.png

38479-routing.png

1 ACCEPTED SOLUTION

Master Guru

3 REPLIES


Contributor

Hi there @Bryan Bende, thanks a lot for your answer.

I think I am going to go with the RouteText option for now.

A couple of questions about that:

  • Will RouteText go to a single UpdateAttribute processor, or to a separate one for each event type? I did try a single UpdateAttribute processor with event_type set to ${RouteText.Route}.
  • How should MergeContent be set up? I have tried the configuration below, but it joins all the different event types into one file, whereas I want a separate file for each event type that goes into HDFS once it reaches, say, 1 GB.

38503-merge-v2.png

Contributor

I found my problem: I was using ${event_type} in the Correlation Attribute Name property, where it should just be event_type.
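For anyone hitting the same symptom, here is a toy model of why the wrong property value merges everything into one file. It is a plain Python sketch, not NiFi's implementation: `bin_flowfiles` and the sample flow files are hypothetical, and it assumes that a property value which does not name an existing attribute gives every flow file the same (missing) correlation value.

```python
from collections import defaultdict


def bin_flowfiles(flowfiles, correlation_attribute_name):
    """Toy binning: group contents by the value of the named attribute."""
    bins = defaultdict(list)
    for flowfile in flowfiles:
        # Look up the attribute by name; a missing attribute yields None.
        key = flowfile["attributes"].get(correlation_attribute_name)
        bins[key].append(flowfile["content"])
    return dict(bins)


# Made-up flow files, each already tagged with an event_type attribute.
flowfiles = [
    {"attributes": {"event_type": "bid"}, "content": "bid-1"},
    {"attributes": {"event_type": "login"}, "content": "login-1"},
    {"attributes": {"event_type": "bid"}, "content": "bid-2"},
]

# Attribute name given as-is: one bin per event type.
correct = bin_flowfiles(flowfiles, "event_type")

# Expression syntax given instead of a name: no such attribute, one shared bin.
wrong = bin_flowfiles(flowfiles, "${event_type}")

print(len(correct), "bins with the plain attribute name")
print(len(wrong), "bin with the expression syntax")
```

The property is meant to hold the name of the attribute to correlate on, so the value of that attribute is looked up per flow file by the processor itself.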

All is sorted now, thanks a lot for the help!