Update attributes of json content using rules

Contributor

Hello there,

 

I have flowfiles with JSON content, which I transform before sending them in a POST request with InvokeHTTP.

I have an attribute "url_value" which I can use to determine my "api_value". For now I use EvaluateJsonPath to add those values to the flowfile attributes, then I use UpdateAttribute, where I have set some rules to get the correct api_value.

Now my question: how do I get the value back into my content? If I use AttributesToJSON, it overwrites all previous content, so I would first have to extract every attribute I need with EvaluateJsonPath. This is not a very elegant solution, since there can be many entries and I actually only need to update one of them!

Is there no processor similar to UpdateAttribute but for flowfile content? How could I use JoltTransformJSON to change a value depending on another attribute?

 

Edit:

I would need something like if url_value:contains("webpage1") set api_value to "webpage1"

elseif url_value:contains("webpage2") set api_value to "webpage2" and so on and so forth

 

I appreciate any hints. Thanks

1 ACCEPTED SOLUTION

Expert Contributor

Hey @Fredi , I believe the answer for your problem is the processor UpdateRecord.

 

UpdateRecord allows you to directly manipulate the fields in your file content. You add dynamic properties to the processor where the key of the property is /<field> (so in your case, '/api_value'), and in the value of this dynamic property you write the logic that determines what value to insert into api_value.

 

In the processor, there is a property called "Replacement Value Strategy", which defines how the value of the dynamic property is read. If you set it to "Record Path Value", you can give a path to a different field in your file (url_value!). I can't test this right now because I'm not at my office, so I'm not entirely sure whether you can manipulate the result after giving a record path (to extract the api_value from the evaluated url_value).

 

Regardless, I'm just about 100% sure this can be done with two processors: one EvaluateJsonPath to extract the url_value into an attribute, then an UpdateRecord that uses the 'Literal Value' replacement strategy. With this strategy, you can just add a property with key '/api_value' and value '${url_value}' (or whatever attribute name you gave to the extracted url_value). Once you can access url_value through the expression language (via ${url_value}), you can use all the available expression language functions to manipulate it.
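To make the intended effect concrete, here is a minimal Python sketch of what the two-processor flow should do to the content: read url_value, apply the contains-style rules from the question, and write only api_value back while leaving every other field untouched. This is not NiFi code. The rule table, the function name update_api_value, and the sample JSON are assumptions for illustration only.

```python
import json

# Hypothetical rule table: (substring to look for in url_value, api_value to set).
# Order matters: the first matching rule wins, like an if/elseif chain.
RULES = [
    ("webpage1", "webpage1"),
    ("webpage2", "webpage2"),
]


def update_api_value(content: str) -> str:
    """Return the JSON content with only api_value changed, based on url_value."""
    record = json.loads(content)
    url_value = record.get("url_value", "")
    for needle, api_value in RULES:
        if needle in url_value:
            record["api_value"] = api_value
            break  # stop at the first matching rule
    # If no rule matches, the record is returned unchanged.
    return json.dumps(record)


if __name__ == "__main__":
    sample = '{"url_value": "www.webpage2.com/sites", "other_field": 1}'
    print(update_api_value(sample))
```

The point of the sketch is the shape of the operation: one field is derived from another and every other entry passes through, which is what AttributesToJSON cannot do without extracting everything first.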

 

Here's an article with a couple of examples on UpdateRecord:

https://community.cloudera.com/t5/Community-Articles/Update-the-Contents-of-FlowFile-by-using-Update... 

(I noticed in the article they used some recordPath related functions like "replaceRegex", so I believe there might be a way to use these and then limit the entire issue to just one UpdateRecord processor! Sadly I'm not too familiar with these myself and this was the first time I've seen them)

 

And here's the expression language documentation:

https://nifi.apache.org/docs/nifi-docs/html/expression-language-guide.html 

You can see there are lots of useful functions for extracting your api_value once you have ${url_value} as an attribute, for example "substring", "find"/"replace", "ifElse", etc. You can try any of these to make sure only the api_value is left in the end.
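If the api_value is always the site name inside the URL, a single regex extraction may replace a long chain of contains rules. The following Python sketch only illustrates that logic. It assumes a url_value shaped like "www.webpage.com/sites", as described elsewhere in this thread, and the function name extract_site is hypothetical. An equivalent would still have to be written with the expression language functions above.

```python
import re
from typing import Optional

# Optional "www." prefix, then capture everything up to the next dot.
SITE_PATTERN = re.compile(r"(?:www\.)?([^./]+)\.")


def extract_site(url_value: str) -> Optional[str]:
    """Return the site name from a host-style URL, or None if it has no dot."""
    match = SITE_PATTERN.match(url_value)
    return match.group(1) if match else None


if __name__ == "__main__":
    print(extract_site("www.webpage.com/sites"))
```

This assumes the value has no scheme such as "https://" in front. If it does, the pattern would need to be adjusted.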

 

Hope this helps! I'm sure ReplaceText and possibly JoltTransformJSON could provide alternative solutions, but I believe UpdateRecord is the cleanest one here and truly makes use of the processor's abilities. If you struggle to use it correctly, you can reply with an example JSON and the expected output, and I'll try to write down the flow when I have time.


5 REPLIES

Expert Contributor

Hello,
you are absolutely correct that the more elegant solution for this would be the JoltTransformJSON processor. With a specification, you can add your modified value, which is held in the attribute, to the content, or replace an existing value with it.
If you don't have any experience with JOLT yet, feel free to post your content here including your requirement where the new value should be added.

Contributor

Thanks for your reply!!

I would need something like if url_value:contains("webpage1") set api_value to "webpage1"

elseif url_value:contains("webpage2") set api_value to "webpage2" and so on and so forth

I don't see how I can do that using JOLT

Contributor

Hi @Fredi!

 

The processor truly similar to UpdateAttribute but for flowfile content is ReplaceText.

Capture the initial part of the content as a group and enrich it with attributes, for example with a Replacement Value like the following:

$1, \"url_value\": \"${url_value}\"

 

Adjust it according to your JSON structure.
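As a rough illustration of the capture-and-append idea, here is a Python sketch using re.sub. It is not a ReplaceText configuration. The search pattern, the function name append_field, and the sample content are assumptions. The pattern captures everything before the final closing brace as group 1, then writes the group back followed by the new field and the brace.

```python
import json
import re

# Group 1 is everything before the last closing brace of the content.
BEFORE_LAST_BRACE = re.compile(r"^(.*)\}\s*$", re.DOTALL)


def append_field(content: str, name: str, value: str) -> str:
    """Append one string field to a non-empty, flat JSON object using a regex."""
    def build(match):
        # json.dumps handles quoting and escaping of the name and value.
        return f"{match.group(1)}, {json.dumps(name)}: {json.dumps(value)}}}"

    return BEFORE_LAST_BRACE.sub(build, content, count=1)


if __name__ == "__main__":
    sample = '{"id": 7}'
    print(append_field(sample, "url_value", "www.webpage.com/sites"))
```

Note that this appends a field rather than updating an existing one, and it breaks on an empty object, so it only suits content whose structure is known in advance.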

Contributor

Thanks for your replies!!

I forgot to mention, the first value contains the value needed for my url_value. That's why I need rules like the ones you can set in UpdateAttribute. I have something like "www.webpage.com/sites" and need to extract "webpage" for my api_value. I don't see how I can do this with ReplaceText or with JOLT.

I would need something like if url_value:contains("webpage1") set api_value to "webpage1"

elseif url_value:contains("webpage2") set api_value to "webpage2" and so on and so forth
