How to iterate over an array and extract each value in NiFi?

avatar
Explorer

How can I iterate over this array in NiFi? I am extracting it from an API response, but I am unable to extract each element one by one to complete my flow.

Example: I need 6100 in a variable so that I can use it for further processing; in the next iteration I need 6101 in that variable, and so on.

Edit: The array is extracted exactly as shown below; it is not assigned to a named array such as arr[] or anything similar.

[ "6100", "6101", "6102", "6103", "6104", "6105", "6106", "6107", "6108", "6423", "23382", "354548" ]
10 REPLIES

avatar

Hi @anony,
Is this the content of your FlowFile, coming from the API as a response?
If yes, you could split your array into several FlowFiles and process them in whatever order you like.

I assume that you do not need the square brackets and double quotes for your processing, so you could use a ReplaceText to remove all of them.

In your ReplaceText processor, you need to define a Java regex that removes the double quotes and square brackets.

If you do not succeed in writing that Java regex, you can try a combination of 4 ReplaceText processors:
- the first one defined as Search Value = " and Replacement Value set to empty string. Make sure you use Replacement Strategy = Literal Replace.
- the second one defined as Search Value = a single space character and Replacement Value set to empty string. Make sure you use Replacement Strategy = Literal Replace.
- the third one defined as Search Value = [ and Replacement Value set to empty string. Make sure you use Replacement Strategy = Literal Replace.
- the fourth one defined as Search Value = ] and Replacement Value set to empty string. Make sure you use Replacement Strategy = Literal Replace.

Once you have removed all of those, proceed with a SplitContent, defined as follows:
- Byte Sequence Format = Text
- Byte Sequence = ,

This will send one FlowFile per value of your initial array down the stream.
Basically you will have 12 FlowFiles, each with one value from your initial array.
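
As a rough illustration of what the ReplaceText and SplitContent steps above do to the content, here is a small Python sketch. It is not NiFi configuration, and the function name split_array_content is made up for this example. The character class [\[\]"\s] used here should also work as a single Java regex in one ReplaceText (Replacement Strategy = Regex Replace with an empty replacement), but treat that as an assumption to verify in your own flow.

```python
import re

def split_array_content(content):
    """Mimic the ReplaceText + SplitContent steps on a FlowFile's text content."""
    # Step 1 (ReplaceText): drop square brackets, double quotes and whitespace.
    cleaned = re.sub(r'[\[\]"\s]', "", content)
    # Step 2 (SplitContent with Byte Sequence = ","): one piece per value.
    return [piece for piece in cleaned.split(",") if piece]

# The array from the question, as it would appear in the FlowFile content.
content = ('[ "6100", "6101", "6102", "6103", "6104", "6105", '
           '"6106", "6107", "6108", "6423", "23382", "354548" ]')
values = split_array_content(content)
print(len(values))            # number of resulting pieces
print(values[0], values[-1])  # first and last value
```

Each element of values corresponds to the content of one FlowFile coming out of SplitContent.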

PS: As you did not provide any information regarding the scope of your iteration, I have ignored it for now, as I did not fully understand what you meant.

avatar
Explorer

Hi, I did that and, as you mentioned, I got 12 FlowFiles. The problem is that I need only one value at a time, as we do in a loop.

So the question is how to use the FlowFile value to run the loop, or "how to run a loop over FlowFiles"?

I now have 12 FlowFiles. For further processing I need the first value (that is, the first FlowFile's value) and process it; the second time I need the second value, the third time the third value, and so on for all FlowFiles.

So the question arises: how can I iterate over FlowFiles?

avatar
Explorer

Actually, this is my real question, in case you can help with it: How can I extract this key value "9876" from the g... - Cloudera Community - 368878 (please see the reply as well; I was unable to edit the question, so I added the correct information in a reply comment).

avatar

As described, your two posts seem to be totally different. I suggest you keep only a single post if both are about the same problem, as you will have a better chance of receiving a proper answer. For now, I will respond only to the one reported here and not the other one.

Now, as for the problem reported here: if you want to process your FlowFiles sequentially, you might try introducing an EnforceOrder processor, in which you define Group Identifier = ${filename} (after SplitContent, all your resulting FlowFiles will have the same filename) and Order Attribute = fragment.index. In addition, you can set Initial Order to 1. I would recommend setting the queue entering EnforceOrder to Load Balance Strategy = Single Node, to make sure that everything is on the same node and is processed in your desired order. From EnforceOrder you can then go to whatever processor you like, on the condition that everything stays on the same node; otherwise, each node in your cluster will execute a different FlowFile (I do not know whether that is what you want or not).
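
To make the ordering idea concrete, here is a plain-Python sketch of what this configuration aims for: FlowFiles that share a group identifier are released in ascending fragment.index order, even if they arrive shuffled. This only simulates the behaviour. The FlowFile class and the release_in_order function are hypothetical and not part of any NiFi API, and the indices are assumed to start at 1 to match the Initial Order suggested above.

```python
from dataclasses import dataclass

@dataclass
class FlowFile:
    filename: str        # used as the group identifier
    fragment_index: int  # stands in for the fragment.index attribute
    content: str

def release_in_order(flowfiles, initial_order=1):
    """Yield FlowFiles of each group in ascending fragment_index order.

    A FlowFile is held back until every lower index in its group was released.
    """
    waiting = {}     # (group, index) -> FlowFile not yet released
    next_index = {}  # group -> next index allowed through
    for ff in flowfiles:
        group = ff.filename
        next_index.setdefault(group, initial_order)
        waiting[(group, ff.fragment_index)] = ff
        # Release as many consecutive FlowFiles as are now available.
        while (group, next_index[group]) in waiting:
            yield waiting.pop((group, next_index[group]))
            next_index[group] += 1

# Three splits of the same parent arriving out of order.
arrived = [
    FlowFile("response.json", 3, "6102"),
    FlowFile("response.json", 1, "6100"),
    FlowFile("response.json", 2, "6101"),
]
ordered = [ff.content for ff in release_in_order(arrived)]
print(ordered)
```

The downstream processor then sees the values in array order, one FlowFile after the other.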

Otherwise, you could create a long chain of RouteOnAttribute processors, but this would require you to know exactly how many items you have in your array. This is not really recommended, but it is an option on the table.

Another solution would be to write a custom script yourself and execute it within NiFi. There are processors for this, such as ExecuteStreamCommand or ExecuteScript/ExecuteGroovyScript.
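
A minimal sketch of the custom-script route, assuming the FlowFile content is the JSON array shown in the question and that the script is run through something like ExecuteStreamCommand, which hands the FlowFile content to the command on standard input. The function process_value is a hypothetical placeholder for whatever you need to do with each value.

```python
import io
import json

def process_value(value):
    """Hypothetical placeholder for the per-value work (e.g. a follow-up call)."""
    return f"processed {value}"

def run(stream_in, stream_out):
    """Read a JSON array from stream_in and handle its values one at a time."""
    values = json.load(stream_in)
    for position, value in enumerate(values, start=1):
        # One loop pass per array element, in the original order.
        stream_out.write(f"{position}: {process_value(value)}\n")

# Demo with in-memory streams. As a standalone script you would instead
# import sys and call run(sys.stdin, sys.stdout).
demo_in = io.StringIO('[ "6100", "6101", "6102" ]')
demo_out = io.StringIO()
run(demo_in, demo_out)
print(demo_out.getvalue(), end="")
```

With this approach the loop lives inside the script, so there is a single FlowFile in and a single FlowFile out instead of 12 separate ones.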

avatar
Explorer

Hi, I did the same and it worked for the first value of the array. Now the question is how to process the 2nd value of the array. I mean, how do I take the 2nd value in the 2nd pass of the loop?

avatar

Sorry, I do not understand your question, nor your problem, right now.
With what I have described, from your original FlowFile you will get 12 FlowFiles, each of them containing a single unique value from your array. Using EnforceOrder, you force NiFi to process the FlowFiles in the order in which they were generated. This means that the first FlowFile will have the first value from your array, the second FlowFile will have the second value, and so on. These FlowFiles are then sent down the stream to your next processor and get processed in that order. So after processing 6100 (with the first FlowFile), you will process 6101 (with the second FlowFile), and so on.

avatar
Explorer

That is correct, but when I connect EnforceOrder to the ExtractText processor, only 1 file goes to ExtractText under the (overtook, wait, skipped) relationships; the other 11 files go to success. What I want is to loop over all of them one by one, that is, to iterate.

So after one complete iteration, how can I get the 2nd FlowFile's value in the ExtractText processor?

The relationship is also a point of doubt. I don't know whether it should be success or overtook, wait and skipped. All I need is for every FlowFile to go to the ExtractText processor, one by one, for the rest of the flow.

 

anony_0-1682327141816.png

 

avatar

Here is the documentation for EnforceOrder: https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.20.0/org.apach...
Have a look there and you will understand what each queue represents.

cotopaul_0-1682327628390.png


Based on your screenshot, you have configured EnforceOrder incorrectly. I just tested the setup I described a couple of messages ago and it works just fine. All the messages go into success (which in your case should be connected to ExtractText).

cotopaul_1-1682328241065.png

 

If you do not want to process them one after the other, I assume you would need to implement a Wait-Notify pattern. An example of what you might try can be found here: https://pierrevillard.com/2018/06/27/nifi-workflow-monitoring-wait-notify-pattern-with-split-and-mer...

avatar
Explorer

So Wait-Notify would be added between EnforceOrder and ExtractText, so that the next FlowFile is processed only after the previous one completes?