NiFi success queue getting filled up

Expert Contributor

Hi All

[Attached images: success.png, mereg1.png]

Thanks a lot to this awesome community.

I run into this problem a lot.

The success connection between ListenTCP and MergeContent fills up and turns red (see the attached image).

When I click on it and choose List queue, nothing appears.

I have configured MergeContent to merge up to 128 MB, using text as the delimiter.

Can anyone help me understand what is happening?

Thanks

Dheeru

6 REPLIES

Master Guru
@dhieru singh

That is because your MergeContent processor is running, which means it is currently working on those FlowFiles.

If you want to list those FlowFiles, stop the MergeContent processor; only then can you view them in NiFi.

If you don't want the queue to show in red, click on the success relationship and open the Settings tab,

then raise the back pressure object threshold and data size threshold as needed. By default these are 10,000 FlowFiles or 1 GB.

41629-queue-conf.png
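
As a rough illustration of why the connection turns red, here is a minimal Python sketch of the back pressure check, assuming back pressure applies as soon as either threshold is reached. The function name and parameters are hypothetical and only model the behaviour described above; this is not NiFi code.

```python
# Illustrative model only, not NiFi code. Defaults mirror the values quoted above.
GB = 1024 ** 3

def back_pressure_applied(queued_count, queued_bytes,
                          object_threshold=10_000, size_threshold=1 * GB):
    """Return True when either threshold on the connection has been reached."""
    return queued_count >= object_threshold or queued_bytes >= size_threshold

# 10,000 small FlowFiles: the object threshold trips long before the size threshold.
print(back_pressure_applied(10_000, 3 * 1024 ** 2))
# The same queue with the object threshold raised to 20,000.
print(back_pressure_applied(10_000, 3 * 1024 ** 2, object_threshold=20_000))
```

With many small FlowFiles the object count is the limit that matters, which is why raising only the size threshold would not change anything here.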

Expert Contributor

@Shu

Thanks for your response, that helps.

However, I am wondering: if my MergeContent processor is waiting (probably for 128 MB before it flushes), why doesn't the queue clear, since the FlowFiles are now with the MergeContent processor?

Thanks

Dheeru

Master Guru

@dhieru singh Yes, you are right: all those FlowFiles are now with the MergeContent processor. In addition, the processor needs to keep track of how the FlowFiles are grouped, based on a user-defined strategy.

Master Mentor
@dhieru singh

The MergeContent processor does not remove FlowFiles from an inbound connection until the actual merge occurs. When MergeContent runs, it allocates queued FlowFiles to one or more bins. While the FlowFiles themselves remain on the queue for tracking purposes, any allocated FlowFiles are now owned by that MergeContent processor. You will not be able to list or delete FlowFiles from this queue while the MergeContent processor is running.

In your case you have two NiFi nodes, each with 10,000 FlowFiles, for a total queue of 20,000 FlowFiles. Keep in mind that each node merges only the FlowFiles that exist on that same node. In your current configuration you have two minimums set:

min number entries = 10,000 <-- that has been met on both nodes
min group size = 127 MB <-- this has not been met yet (you have about 2.2 MB on each node)

Both minimums must be met before a bin is eligible to merge.

Max number entries = 15,000 <-- you have not reached this on either node.
Max group size = 128 MB <-- You have not met this on either node.

NiFi will force a merge when either of these is reached. In your case, you would hit 15000 long before you reach 128 MB.

There is one more setting you have not set at all.

Max bin age =

Setting this will force the merge of a bin, no matter what any of the above settings are, once the age of the bin has exceeded the configured value. I always recommend setting this to avoid FlowFiles lingering for exceptionally long periods of time.
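
To make the rule concrete, here is a minimal Python sketch of the eligibility logic as it is described in this thread: both minimums met, or either maximum reached, or the bin older than max bin age. The function `bin_should_merge` and its parameters are hypothetical names for illustration, and the 5-minute bin age is an assumed value; this models the explanation above, not NiFi's source code.

```python
MB = 1024 ** 2

def bin_should_merge(entries, size_bytes, age_seconds, *,
                     min_entries=1, min_size=0,
                     max_entries=None, max_size=None, max_bin_age=None):
    """Model of the merge rule described in this thread (not NiFi source code)."""
    mins_met = entries >= min_entries and size_bytes >= min_size
    max_entries_hit = max_entries is not None and entries >= max_entries
    max_size_hit = max_size is not None and size_bytes >= max_size
    aged_out = max_bin_age is not None and age_seconds > max_bin_age
    return mins_met or max_entries_hit or max_size_hit or aged_out

# Settings from the thread.
settings = dict(min_entries=10_000, min_size=127 * MB,
                max_entries=15_000, max_size=128 * MB)

# One node: 10,000 FlowFiles totalling about 2.2 MB, no max bin age set.
print(bin_should_merge(10_000, int(2.2 * MB), age_seconds=3600, **settings))

# Same bin with an assumed 5-minute max bin age: the age alone forces the merge.
print(bin_should_merge(10_000, int(2.2 * MB), age_seconds=3600,
                       max_bin_age=300, **settings))
```

The first call returns False because the min group size is not met and neither maximum is reached, which matches the stuck queue; the second returns True because of the bin age.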

You could increase the back pressure object threshold on the inbound connection, but the size you would need to increase it to in order to hit your 127 MB min group size would amount to merging more than 565,000 FlowFiles. Since MergeContent must hold the attributes for all FlowFiles being merged in heap memory, merging this number of FlowFiles in one operation is likely to result in out of memory errors.
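
The FlowFile count can be estimated with a quick back-of-the-envelope calculation. This sketch assumes the "about 2.2 MB per 10,000 FlowFiles" figure quoted above holds on average; because that figure is approximate, the result is only a ballpark.

```python
MB = 1024 ** 2

observed_count = 10_000        # FlowFiles queued on one node
observed_bytes = 2.2 * MB      # approximate queue size on that node
min_group_size = 127 * MB      # configured min group size

avg_flowfile_bytes = observed_bytes / observed_count
flowfiles_needed = min_group_size / avg_flowfile_bytes

print(f"average FlowFile size: ~{avg_flowfile_bytes:.0f} bytes")
print(f"FlowFiles needed for 127 MB: ~{flowfiles_needed:,.0f}")
```

The estimate comes out in the high 500,000s, in line with the "more than 565,000" figure, and far beyond the default 10,000 object threshold.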

In cases like this I recommend doing a two-phase merge with multiple MergeContent processors.

I would configure the first MergeContent to merge based on min (15,000) and max (20,000) number entries. You should also set a max bin age. Make sure you also raise the connection's back pressure object threshold from 10,000 to 20,000.

The second MergeContent processor would then be set to merge based on min (120 MB) and max (128 MB) group size. You should also set a max bin age.

The result is far less heap usage with the same end result.
The flow would look something like this:

40050-screen-shot-2017-10-30-at-41254-pm.png
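
To see why the two-phase approach reduces heap pressure, this rough Python sketch compares how many FlowFiles a single merge has to handle in each layout. It assumes the average FlowFile size implied by the figures above (about 2.2 MB per 10,000 FlowFiles) and that the first phase merges at its minimum of 15,000 entries; the variable names are illustrative only.

```python
import math

MB = 1024 ** 2
avg_flowfile_bytes = 2.2 * MB / 10_000   # assumption taken from the thread's figures
target_bytes = 120 * MB                  # min group size of the second MergeContent

# Single phase: one MergeContent must bin every small FlowFile itself.
single_phase_inputs = math.ceil(target_bytes / avg_flowfile_bytes)

# Two phase: the first MergeContent bundles 15,000 small FlowFiles at a time,
# and the second MergeContent only bins those bundles.
phase1_entries = 15_000
bundle_bytes = phase1_entries * avg_flowfile_bytes
phase2_inputs = math.ceil(target_bytes / bundle_bytes)

print(f"single phase: ~{single_phase_inputs:,} FlowFiles in one merge")
print(f"two phase: {phase1_entries:,} per first-phase merge, "
      f"then ~{phase2_inputs} bundles per second-phase merge")
```

Under these assumptions no single merge ever tracks more than 20,000 FlowFiles (the first phase's max), instead of several hundred thousand.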

Thanks,

Matt

Expert Contributor

Hi @Matt Clarke, thanks a lot, I appreciate your help.

I did the same thing and need your suggestion here:

https://community.hortonworks.com/questions/144575/mergecontent-processor-and-default-block-size-of-...

In addition, I need FlowFiles merged based on size (120 to 128 MB), but it is forcing me to set "min number entries".

I also read somewhere that it is an "and" condition, in this link:

https://community.hortonworks.com/content/kbentry/1876/mergecontent-processor-inner-workings.html

Thanks

Dheeru

Master Mentor

@dhieru singh

Min number of entries must be set and defaults to 1. That is fine as long as you don't set max num entries.

You are correct: it is (min number of entries AND min group size) OR max number of entries OR max group size. So either of the "max" settings will force a merge, just like max bin age will.