Support Questions

IslamGamal · ‎06-10-2022

Hello everyone

i have a content in Json (an array of id)

[{"id":7532233},{"id":755163}]

and i need to convert this whole array into one attribute so that i can after that refer to it in an ExecuteSql processor with statement :

update column 1 where id exists(${attributeName})

SAMSAL · ‎06-10-2022

Hi,

If I understood your question correctly, you want to place the file content into an attribute and store it in sql? If that is the case you can use ExtractText Processor. there you add dynamic property lets call it "IdArray" and set the regular expression to "(?s)(^.*$)" to capture everything in the file. Just be careful there is a size limit set of 1 MB to be captured if you think your data will be more then you can increase it. Hope that helps, if so please accept solution.

Thanks

View solution in original post

SAMSAL · ‎06-10-2022

Hi,

If I understood your question correctly, you want to place the file content into an attribute and store it in sql? If that is the case you can use ExtractText Processor. there you add dynamic property lets call it "IdArray" and set the regular expression to "(?s)(^.*$)" to capture everything in the file. Just be careful there is a size limit set of 1 MB to be captured if you think your data will be more then you can increase it. Hope that helps, if so please accept solution.

Thanks

IslamGamal · ‎06-10-2022

Thanks alot SAMSAL. It actually worked

but I've a two points if u can help me
1. since am not regex professional, is there a regex that could produce

7532233,755163
the id inside the array separated by comma
since i will be using a sql statement like this

SELECT  *
FROM    table
WHERE   id = ANY(ARRAY[${idArray}])

2. even i changed the Maximum Buffer Size to 5 MB and Maximum Capture Group Length to 5000
i can see that the attribute is being duplicated with name

idArray.1
Is that something i have to worry about ? while calling the attribute ?

SAMSAL · ‎06-10-2022

Hi,

regarding the first point I dont think you can do that through regex, this is more of transformation operation and you might need to use JsonJolt Processor to do transromation first to simplify the json and store all id's in an array then use regular expression to extract the ids. It will complicate matter it might affect performance. What kind of SQL are you using ? Do you have any json functions in that sql? If that is the case I would recommend you deffer this process to SQL and utilize sql function to parse json. For example in MS SQL you can use OPENJSON function to transpose json into table and then you can use this table in your query.

For the second point, this is because the way the extractText reg expression work it treats like group, so beside the main attribute you will get an indexed attribute for each matched group. You can read more about this here:

https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.5.0/org.apache...

You can also use this post to see if you can extract one attirbute. I just found about it:

https://community.cloudera.com/t5/Support-Questions/Extract-whole-json-flowfilecontent-small-to-sing...

MattWho · ‎06-10-2022

@IslamGamal

Keep in mind that all the FlowFile attributes for a FlowFile are held in NiFi's JVM heap memory. Creating large attributes on your FlowFiles can quickly eat up a lot of heap memory and affect JVM performance.

Thanks,
Matt

IslamGamal · ‎06-10-2022

Hi Matt,
Thanks for alerting me, but since I need to use the extracted id array inside a sql statement and I need to refer to the id array (the only way to do that is to convert the content it to an attribute as far as I know)
Please if u have an article about how to handle the heap memory let me know
because i need to know the cycle and when it's been clean is it after the flowfile finish or something else
in other scenarios the flowfile will contains a lot of attributes and there is millions of data coming in
so I appreciate your info

Cloudera Community

Support Questions

[ NIFI ] Convert the flowfile content to an attribute

[Nifi] Converting a delimited FlowFile's content t...

NiFi - Extract content from AVRO FlowFile to attri...

Nifi flowFile content changes spontaneously

[NiFi] Combine FlowFile Attribute and Content

Nifi Processors status merge attributes into singl...

Python in ExecuteScript: Transform both flowfile a...

append an attribute to flowfile content at the end

Nifi: Compare contents of two files

NiFi Large FlowFile Attributes

NiFi : combine flow file attribute and content