How to pass dynamic "Max Bin Age" & "Max Number Of Entries" in MergeContent Processor Nifi

New Contributor

I would like to pass dynamic values to "Max Bin Age" and "Max Number Of Entries" in the MergeContent processor, but these properties do not support Expression Language.

Please suggest if there is any way to achieve this.

3 REPLIES 3

Super Mentor

@HariAllstate 

What is the use case for wanting these to be dynamic?

 

The MergeContent processor allocates FlowFiles located in the inbound connection(s) to bins based on the configuration of these properties:

Merge Strategy
- Bin-Packing Algorithm - Keeps allocating FlowFiles to a bin until both configured mins (Minimum Number of Entries and Minimum Group Size) are met. Bins can also be restricted to FlowFiles that all have the same value in the attribute specified in the Correlation Attribute Name property.
- Defragment - FlowFiles are allocated to bins based on the fragment attributes set on the inbound queued FlowFiles.

The purpose of Max Bin Age is to prevent bins from getting stuck forever because they do not meet the criteria necessary to be merged. Consider the scenario where Maximum Number of Entries has been reached, preventing any new FlowFiles from being allocated to a bin, but the binned FlowFiles do not total enough size to meet the Minimum Group Size. Since both mins must be satisfied for a bin to be merged, that bin could potentially sit forever. When Max Bin Age is reached, the bin is forced to merge. In the case of Defragment, the FlowFiles are routed to failure if Max Bin Age is reached before all fragments are allocated to a bin. So Max Bin Age should be set to the maximum latency you want to allow on a bin.

It is not clear why you would want that to be dynamic. If it were, where would the value come from? Many FlowFiles are allocated to a bin, and they could end up carrying a variety of values.
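To make the stuck-bin scenario concrete, here is a minimal Python sketch of the decision described above. It is only an illustration of the rules as stated in this reply, not NiFi's actual code: the `Bin` class, the `bin_action` function, and all parameter names are hypothetical, and fragment completeness for Defragment is deliberately not modelled.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Bin:
    """Hypothetical stand-in for a MergeContent bin (not a NiFi class)."""
    created_at: float                               # time the bin was created, in seconds
    sizes: List[int] = field(default_factory=list)  # byte size of each binned FlowFile


def bin_action(bin_, now, min_entries, min_group_size, max_bin_age, defragment=False):
    """Return 'merge', 'failure', or 'wait' for one bin."""
    expired = (now - bin_.created_at) >= max_bin_age
    if defragment:
        # Assumption: only the timeout path is shown; a complete set of
        # fragments would merge, but that check is left out of this sketch.
        return "failure" if expired else "wait"
    # Bin-Packing: both mins must be satisfied before a bin is merged.
    mins_met = len(bin_.sizes) >= min_entries and sum(bin_.sizes) >= min_group_size
    if mins_met or expired:
        return "merge"  # an expired bin is forced to merge
    return "wait"


# A full bin (3 entries) that is too small to reach the min group size of 100 bytes.
stuck = Bin(created_at=0.0, sizes=[10, 10, 10])
print(bin_action(stuck, now=5.0, min_entries=1, min_group_size=100, max_bin_age=60.0))
print(bin_action(stuck, now=60.0, min_entries=1, min_group_size=100, max_bin_age=60.0))
```

In this sketch the bin keeps waiting until its age reaches the configured limit, at which point it is merged anyway, which is the latency cap that Max Bin Age is meant to provide.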

Since only the "mins" must be satisfied to merge a bin, it is also not clear why you would want Max Number of Entries to be dynamic. The same issue applies here: many FlowFiles are allocated to each bin, and they may have different values.

Also note that the processor does not use the max settings when considering whether a bin is ready to be merged. When the MergeContent processor executes, it looks only at the FlowFiles queued on an inbound connection at that exact moment in time and allocates them to one or more bins. At the end of that allocation, each bin is evaluated to see whether the mins were satisfied (or, for Defragment, whether all the fragments are in the bin), and if so, it is merged.
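The single-execution behavior described above can be sketched roughly as follows. This is a simplified Python model under stated assumptions, not the processor's implementation: `run_once` and its parameters are hypothetical names, FlowFiles are represented only by their sizes, and correlation attributes, Max Bin Age, and Maximum Group Size are ignored.

```python
def run_once(queued_sizes, min_entries, min_group_size, max_entries):
    """Simulate one execution: bin a snapshot of the queue, then evaluate each bin."""
    bins = [[]]
    for size in queued_sizes:
        if len(bins[-1]) >= max_entries:
            bins.append([])  # the max only caps a bin; a new bin is started
        bins[-1].append(size)

    merged, waiting = [], []
    for b in bins:
        if not b:
            continue  # nothing was queued for this bin
        # Readiness looks only at the mins, never at the max.
        ready = len(b) >= min_entries and sum(b) >= min_group_size
        (merged if ready else waiting).append(b)
    return merged, waiting


# Five FlowFiles are queued when the processor runs; the max is 3 entries per bin.
merged, waiting = run_once([5, 5, 5, 5, 5], min_entries=2, min_group_size=0, max_entries=3)
print(merged)   # bins that satisfied both mins in this pass
print(waiting)  # bins left waiting for more FlowFiles
```

In this model the second bin holds only two FlowFiles yet still merges, because the minimums are met; the maximum never has to be reached for a merge to happen.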

If you found that this addressed your query, please take a moment to log in and click "Accept as Solution".
Thank you,

Matt

 

Community Manager

Hi @HariAllstate, I'm glad to see you resolved your issue. Please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future.



Regards,

Vidya Sargur,
Community Manager

