
NiFi best practice to filter flowfiles using an external file


New Contributor

Hi,

Just wondering what the best practice is for my use case. My flowfiles are JSON objects, and I need to filter/route them using an external file containing a list of values, i.e. for each flowfile, check whether the value of some field (key) X is in the file or not.

The only two processors I noticed I can use for that are ScanContent and ReplaceTextWithMapping (which would "replace" a value with an identical one).

ScanContent seems more appropriate since it does not perform a redundant 'replace' action, but on the other hand it does not have the 'File Refresh Interval' property that ReplaceTextWithMapping has. Hence I'm guessing it continuously refreshes the dictionary file (I didn't find relevant information about this in the documentation), which is also an expensive (and, for my use case, redundant) action that could harm the performance of the flow.

I'm leaning toward the ReplaceTextWithMapping approach, skipping the continuous refreshing of the file, but I wanted to ask here whether there is another best-practice approach and to make sure I've understood things correctly and haven't missed something.

Thanks in advance,

Liran

1 ACCEPTED SOLUTION


Re: NiFi best practice to filter flowfiles using an external file

Super Guru

From the code, ScanContent appears to watch the specified dictionary file for changes and reload it only when it changes; otherwise it shouldn't refresh the dictionary file (unless something weird happens with the internal search mechanism).
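To illustrate the general "reload only when the file changes" idea, here is a minimal standalone Python sketch. It is not ScanContent's actual implementation; the class name `ReloadingDictionary`, the `reload_count` counter, and the one-term-per-line file layout are all assumptions made for the example.

```python
import os


class ReloadingDictionary:
    """Holds the terms of a dictionary file, re-reading it only when it changes.

    Hypothetical helper for illustration; assumes one term per line.
    """

    def __init__(self, path):
        self.path = path
        self._mtime = None          # modification time seen at the last load
        self._terms = frozenset()
        self.reload_count = 0       # how many times the file was actually read

    def terms(self):
        # A cheap stat() call replaces re-reading the whole file on every lookup.
        mtime = os.stat(self.path).st_mtime_ns
        if mtime != self._mtime:
            with open(self.path, encoding="utf-8") as handle:
                self._terms = frozenset(
                    line.strip() for line in handle if line.strip()
                )
            self._mtime = mtime
            self.reload_count += 1
        return self._terms
```

With this pattern, the per-flowfile cost is a single metadata check, and the file contents are only parsed again after someone edits the file.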

Alternatively, I answered a Stack Overflow question with a similar use case, using ExecuteScript to check the JSON (and, in their case, replace the value using an external file). That example also reads the file every time, but you could use a similar approach with InvokeScriptedProcessor and read the file in the initialize() method, so that it is not re-read during onTrigger (which is called when the processor is scheduled).
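The read-once-then-check logic can be sketched in plain Python, independent of the NiFi scripting API. This is only a rough outline under stated assumptions: `load_allowed_values` and `route` are hypothetical names, the file is assumed to hold one value per line, and the "matched"/"unmatched" strings are illustrative labels for the two routes. In an actual InvokeScriptedProcessor script, the load step would sit in initialize() and the check in onTrigger.

```python
import json


def load_allowed_values(path):
    """Read the external file once; assumes one value per line."""
    with open(path, encoding="utf-8") as handle:
        return frozenset(line.strip() for line in handle if line.strip())


def route(flowfile_content, field, allowed_values):
    """Decide where a flowfile goes based on one field of its JSON object.

    Returns "matched" if the field's value is in allowed_values,
    otherwise "unmatched" (including when the field is absent).
    """
    record = json.loads(flowfile_content)
    value = record.get(field)
    if value is not None and str(value) in allowed_values:
        return "matched"
    return "unmatched"
```

Keeping the values in a set makes each per-flowfile check a constant-time lookup, so the file size mainly affects the one-time load.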



Re: NiFi best practice to filter flowfiles using an external file

New Contributor

If ScanContent watches for changes, I think that solves my problem. Thanks! :)
