Mathematical operations on flowfile

Explorer

I searched for examples of basic mathematical transformations on FlowFile content (JSON fields, for example), but I did not find any. I want to do some basic multiplication, addition, etc.

As I see it, I can write a custom JOLT transform for this, do it in SQL with staging tables or some other SQL approach, or write a new processor.

But I was wondering: what is the NiFi way to do this?

Best

Bojan

Re: Mathematical operations on flowfile

@Bojan Kostic, please see the Mathematical Operations and Numeric Manipulation section of NiFi's Expression Language Guide. These expressions can be used in processors such as UpdateAttribute to modify FlowFile attributes. Attributes can be extracted from FlowFile content using the EvaluateJsonPath processor.

Re: Mathematical operations on flowfile

Downvoted because the question is about doing operations on the FlowFile content, not the attributes.

Re: Mathematical operations on flowfile

Edited answer to include extraction of attributes from content.

Re: Mathematical operations on flowfile

Guru

In addition to the EL methods in @slachterman's post, if you need to do more advanced math you could write a script (e.g. Groovy/Java) and execute it with ExecuteScript to do math on the FlowFile contents. ExecuteScript allows you to use external dependencies (e.g. commons-math.jar) required by the script.

Re: Mathematical operations on flowfile

The best way to do it depends on the end goal of the operations. Are you computing the updated value to replace the old one in the content, so that the new value is stored at the end destination? Or are you computing it in order to make some routing decision later?

If you are trying to update the value in the content so that it will be stored later on, the best solution depends on the format of the content. If it is JSON, the JoltTransformJSON processor is probably your best bet. If it's XML, then TransformXML would work. Outside of those two formats it would depend on your skill set. If you like regular expressions, you can use the ReplaceText processor: write a regex as the search value, then combine back-references and Expression Language to build the replacement value. Also, as @Greg Keys mentioned, the ExecuteScript and InvokeScriptedProcessor processors would allow you to write a script to manipulate the content as needed. Lastly, if you'd rather code in Java, then a new processor may be best.
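
To illustrate the regular-expression route, here is a minimal Python sketch of the same idea outside NiFi: search for a numeric value with a capture group, then build the replacement from the captured number. It is not ReplaceText configuration, and the `price` field, `PRICE_PATTERN`, and `scale_price` are hypothetical names chosen only for the example.

```python
import re

# Hypothetical pattern: captures the '"price": ' prefix and the number after it.
PRICE_PATTERN = re.compile(r'("price"\s*:\s*)(\d+(?:\.\d+)?)')


def scale_price(content, multiplier):
    """Multiply every matched "price" value, leaving the rest of the text untouched."""

    def _replace(match):
        prefix = match.group(1)          # back-reference 1: the field name and colon
        number = float(match.group(2))   # back-reference 2: the numeric value
        return f"{prefix}{number * multiplier}"

    return PRICE_PATTERN.sub(_replace, content)


if __name__ == "__main__":
    print(scale_price('{"id": 7, "price": 12.5}', 2))
```

A regex like this only sees text, so it would also match a `"price"` key nested somewhere unexpected; for structured content a JSON-aware approach is usually safer.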

If you're computing this value purely to make routing decisions (and don't want or need to modify the content), then using attributes would be best. The first step would be to extract the initial value to an attribute and, just like the first case, how you do that depends on your content format. The "Evaluate*" processors (EvaluateJsonPath, EvaluateXPath, EvaluateXQuery) and ExtractText all take some value from the content and add it as an attribute (if these don't cover your content format, you'd need to write a script or a new processor). Once you have it as an attribute, you can use the EL methods @slachterman mentioned to route the FlowFile.
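
The attribute-based flow can be sketched in plain Python as extract, compute, then route, with the content never modified. This only mimics the logic; it is not NiFi code, and the `quantity` and `unit_price` fields, the threshold, and the `large`/`small` route names are all assumptions made up for illustration.

```python
import json


def extract_attribute(content, field):
    """Copy a top-level JSON field into a string 'attribute' (attributes are strings)."""
    return str(json.loads(content)[field])


def route(attributes, threshold):
    """Pick a route name from a value computed on the attributes alone."""
    total = float(attributes["quantity"]) * float(attributes["unit_price"])
    return "large" if total > threshold else "small"


if __name__ == "__main__":
    content = '{"quantity": 3, "unit_price": 4.5}'
    attributes = {
        "quantity": extract_attribute(content, "quantity"),
        "unit_price": extract_attribute(content, "unit_price"),
    }
    print(route(attributes, threshold=10))
```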

Re: Mathematical operations on flowfile

Explorer

Thanks, all, for your opinions. In the end I will do some work with the ExecuteScript processor and the JSON libraries for Groovy. I wanted to use JOLT, but unfortunately it can't use Expression Language.

My idea was to change one field in a flat JSON document with a math operation such as multiplication. The multiplier should come from some other source.

As for a custom processor, that is always a solution and it is not hard. But for this I will try ExecuteScript.
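
For reference, the core of that field update is small. Below is a minimal sketch in plain Python (the plan above is Groovy, so treat this as an illustration of the logic only). The `amount` field and the `multiply_field` helper are hypothetical, and the ExecuteScript wiring that reads and writes the FlowFile content is left out.

```python
import json


def multiply_field(content, field, multiplier):
    """Parse flat JSON, multiply one numeric field, and serialize it back."""
    record = json.loads(content)
    record[field] = record[field] * multiplier  # the multiplier comes from another source
    return json.dumps(record)


if __name__ == "__main__":
    print(multiply_field('{"id": 1, "amount": 10}', "amount", 3))
```

In a real flow the multiplier could arrive as a FlowFile attribute or a lookup result; here it is just a function argument.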

Best

Bojan
