Created 06-10-2022 11:53 AM
Hello everyone
i have a content in Json (an array of id)
[{"id":7532233},{"id":755163}]
and i need to convert this whole array into one attribute so that i can after that refer to it in an ExecuteSql processor with statement :
update column 1 where id exists(${attributeName})
Created 06-10-2022 12:35 PM
Hi,
If I understood your question correctly, you want to place the file content into an attribute and store it in sql? If that is the case you can use ExtractText Processor. there you add dynamic property lets call it "IdArray" and set the regular expression to "(?s)(^.*$)" to capture everything in the file. Just be careful there is a size limit set of 1 MB to be captured if you think your data will be more then you can increase it. Hope that helps, if so please accept solution.
Thanks
Created 06-10-2022 12:35 PM
Hi,
If I understood your question correctly, you want to place the file content into an attribute and store it in sql? If that is the case you can use ExtractText Processor. there you add dynamic property lets call it "IdArray" and set the regular expression to "(?s)(^.*$)" to capture everything in the file. Just be careful there is a size limit set of 1 MB to be captured if you think your data will be more then you can increase it. Hope that helps, if so please accept solution.
Thanks
Created 06-10-2022 01:03 PM
Thanks alot SAMSAL. It actually worked
but I've a two points if u can help me
1. since am not regex professional, is there a regex that could produce
7532233,755163
the id inside the array separated by comma
since i will be using a sql statement like this
SELECT *
FROM table
WHERE id = ANY(ARRAY[${idArray}])
2. even i changed the Maximum Buffer Size to 5 MB and Maximum Capture Group Length to 5000
i can see that the attribute is being duplicated with name
Created 06-10-2022 02:46 PM
Hi,
regarding the first point I dont think you can do that through regex, this is more of transformation operation and you might need to use JsonJolt Processor to do transromation first to simplify the json and store all id's in an array then use regular expression to extract the ids. It will complicate matter it might affect performance. What kind of SQL are you using ? Do you have any json functions in that sql? If that is the case I would recommend you deffer this process to SQL and utilize sql function to parse json. For example in MS SQL you can use OPENJSON function to transpose json into table and then you can use this table in your query.
For the second point, this is because the way the extractText reg expression work it treats like group, so beside the main attribute you will get an indexed attribute for each matched group. You can read more about this here:
You can also use this post to see if you can extract one attirbute. I just found about it:
Created 06-10-2022 12:43 PM
@IslamGamal
Keep in mind that all the FlowFile attributes for a FlowFile are held in NiFi's JVM heap memory. Creating large attributes on your FlowFiles can quickly eat up a lot of heap memory and affect JVM performance.
Thanks,
Matt
Created 06-10-2022 01:11 PM
Hi Matt,
Thanks for alerting me, but since I need to use the extracted id array inside a sql statement and I need to refer to the id array (the only way to do that is to convert the content it to an attribute as far as I know)
Please if u have an article about how to handle the heap memory let me know
because i need to know the cycle and when it's been clean is it after the flowfile finish or something else
in other scenarios the flowfile will contains a lot of attributes and there is millions of data coming in
so I appreciate your info