Merge big JSON using Wait/Notify in Apache NiFi


I am having trouble with my NiFi dataflow, most likely because I do not understand how Wait/Notify works. Here is my basic dataflow:
- I list my S3 bucket, which contains zip files.
- I extract the JSON file from each zip. Each JSON file is 2 GB to 3 GB.
- I then split the JSON several times, to avoid 'Out of memory' errors, until I have single-record FlowFiles (this is where the trouble starts).
- I want to flatten each FlowFile.
- I want to merge the flattened FlowFiles to rebuild the original JSON. I know I cannot do this with 'Merge Content' because my JSON goes through several split processors.

Can you please explain how to use Wait/Notify?

Does Wait automatically merge the FlowFiles?

I have not been able to work this out. I also looked at this post, but it is still not clear.


Super Mentor


MergeContent should use the Defragment merge strategy.
There is no default value for Max Bin Age, so I am not sure what you set there. If it is left blank, the processor will wait forever to merge a bin unless it runs out of bins.
Also make sure you raise the object and size thresholds on the connections feeding the MergeContent processors so that they are large enough to hold the number of splits that need to be merged.
Given the size of the FlowFiles being merged, it may take some time to merge all of them.
As for bins, try setting Maximum number of Bins to 21.
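
To make the Defragment idea more concrete, here is a rough Python sketch of how fragments can be grouped and reassembled from the fragment.identifier, fragment.index and fragment.count attributes that the split processors write. It is only a conceptual model, not NiFi's actual code: the function names and the dict layout of a FlowFile are made up for illustration, and it does not model Max Bin Age, the bin limit, or connection thresholds.

```python
from collections import defaultdict


def make_fragment(identifier, index, count, content):
    """Hypothetical stand-in for a split FlowFile: attributes plus content bytes."""
    return {
        "attributes": {
            "fragment.identifier": identifier,
            "fragment.index": str(index),
            "fragment.count": str(count),
        },
        "content": content,
    }


def defragment(flowfiles):
    """Bin fragments by fragment.identifier and merge a bin once it is complete.

    Returns (merged, pending): merged is a list of (identifier, bytes) for
    every bin that received all fragment.count pieces, concatenated in
    fragment.index order; pending holds the bins still waiting for pieces.
    """
    bins = defaultdict(dict)
    merged = []
    for flowfile in flowfiles:
        attrs = flowfile["attributes"]
        identifier = attrs["fragment.identifier"]
        index = int(attrs["fragment.index"])
        count = int(attrs["fragment.count"])
        bins[identifier][index] = flowfile["content"]
        # A bin is complete only when every fragment of that split has arrived.
        if len(bins[identifier]) == count:
            parts = bins.pop(identifier)
            merged.append((identifier, b"".join(parts[i] for i in sorted(parts))))
    return merged, dict(bins)
```

In this model, every identifier that is still incomplete occupies one bin, and a bin cannot complete until all of its fragments are queued in front of the merge step. That is why the number of bins and the connection thresholds both matter.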