Applying string manipulations/mathematical operations to the contents of a flow file in NiFi

New Contributor

I have a flow file coming in that contains fixed-width data in the following format:

ABC 0F 15343543543454434 gghhhhhg

ABC 01 43353434343414343 hjvh

I want to have my output data in the following format:

ABC|15|15343543543454434|gghhhhhg

ABC|1|43353434343414343|hjvh

To get this output, I need to convert the second field in each line from hexadecimal to a base-10 integer and strip the surrounding whitespace from all the other fields.

I tried using the ReplaceText processor, but I could not find a way to convert the second field to a base-10 integer or to apply a strip function to the string fields.

1 ACCEPTED SOLUTION

Working with hexadecimal numbers is not something that is easily done in the current release of NiFi. To get it to work there, you'd need to use one of the scripting processors, ExecuteScript or InvokeScriptedProcessor.

That said, numeric evaluation is one of my focuses for the upcoming release (which is currently being finalized), and I've been able to create a solution involving just the ReplaceText processor. I used the following configuration:
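As a rough illustration of the per-line logic such a script would need, here is a plain Python sketch. It deliberately leaves out the NiFi session and flow-file handling that ExecuteScript requires, the names `convert_line` and `convert_content` are hypothetical, and it assumes every line has exactly four fields with no embedded spaces:

```python
def convert_line(line: str) -> str:
    """Turn one whitespace-separated record into pipe-delimited form."""
    # split() with no arguments splits on runs of whitespace and ignores
    # leading/trailing whitespace, which covers the "strip" requirement.
    name, hex_field, number, text = line.split()
    # int(..., 16) parses the second field as hexadecimal, e.g. "0F" -> 15.
    return "|".join([name, str(int(hex_field, 16)), number, text])


def convert_content(content: str) -> str:
    """Apply convert_line to every non-blank line of the content."""
    return "\n".join(
        convert_line(line) for line in content.splitlines() if line.strip()
    )


if __name__ == "__main__":
    sample = "ABC 0F 15343543543454434 gghhhhhg\n\nABC 01 43353434343414343 hjvh"
    print(convert_content(sample))
```

If a field can be empty or contain spaces, slicing each line by fixed column offsets would be safer than `split()`.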

Search Value: ^(\w*)\ *(\w*)\ *(\d*)\ *(\w*)$ 
Replacement Value: $1|${'$2':prepend('0x'):append('p0'):toNumber()}|$3|$4 
Replacement Strategy: Regex Replace 
Evaluation Mode: Line-by-line 

The rest is up to your use case (i.e., whichever character set the data is in). The search value creates a capture group for each of the four fields. In the replacement value, I then pass the second capture group (the hex field) through Expression Language functions to convert it to base 10. The purpose of "prepend" and "append" is that, on the current master branch, only decimal (double) values accept hexadecimal notation (I need to improve that), so I format the value as a hexadecimal double before converting it to a number.

Unfortunately this use case isn't currently handled out of the box, but it soon will be!
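To see what the search value and the prepend/append trick do outside NiFi, here is a small Python sketch. It is only an analogy: Python's `re` and `float.fromhex` stand in for ReplaceText and the Expression Language, and `replace_line` is a hypothetical name. Wrapping "0F" as "0x0Fp0" yields a hexadecimal floating-point string (mantissa 0F, binary exponent 0), which parses to 15.0:

```python
import re

# Same shape as the Search Value above: four capture groups separated by spaces.
PATTERN = re.compile(r"^(\w*) *(\w*) *(\d*) *(\w*)$")


def replace_line(line: str) -> str:
    """Mimic the regex replacement for a single line."""
    match = PATTERN.match(line)
    if match is None:
        # A regex replacement leaves non-matching input unchanged.
        return line
    name, hex_field, number, text = match.groups()
    # Equivalent in spirit to prepend('0x'):append('p0'):toNumber().
    as_double = float.fromhex("0x" + hex_field + "p0")
    return f"{name}|{int(as_double)}|{number}|{text}"


if __name__ == "__main__":
    print(replace_line("ABC 0F 15343543543454434 gghhhhhg"))
    print(replace_line("ABC 01 43353434343414343 hjvh"))
```

Note that the third group is copied through as text, so the long digit string is never parsed as a number and cannot lose precision.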

3 REPLIES

New Contributor

Thank you for your reply. However, I have a follow-up question. My assumption is that the toRadix function converts from base 10 to any other base. My requirement is to convert hexadecimal to base 10. Could you please explain how ${'$2':prepend('0x'):append('p0'):toRadix(10)} would convert 0F to 15?

Also, I tried the expression, but it fails with the following error: java.lang.ClassCastException

Ah, you are right about "toRadix": I was using it in a way that was wrong but still happened to work. It should be replaced with "toNumber()". The expression formats the value so that it is accepted as a "decimal" and then converts it to a whole number.

For NIFI-2950 I am also adding a "fromRadix" function that converts from a given base to base 10. When you tried the expression, did you use a recent build off master or a released NiFi version?

I just posted a pull request (PR) adding proper support for whole-number hex values and a fromRadix() function. Please feel free to test it and post feedback on the PR on GitHub. This should allow you to simply do "${'$2':fromRadix(16)}". Here is the link: https://github.com/apache/nifi/pull/1161
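For anyone unsure about the two directions, here is a minimal Python sketch of the distinction. The helpers `to_radix` and `from_radix` are hypothetical names that only illustrate the semantics discussed above; they are not NiFi's implementation, and the lowercase digit output is an arbitrary choice:

```python
DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"


def to_radix(value: int, radix: int) -> str:
    """Base-10 integer -> digit string in the given radix (2 to 36)."""
    if not 2 <= radix <= 36:
        raise ValueError("radix must be between 2 and 36")
    if value == 0:
        return "0"
    sign = "-" if value < 0 else ""
    value = abs(value)
    digits = []
    while value:
        # Peel off the least significant digit on each pass.
        value, remainder = divmod(value, radix)
        digits.append(DIGITS[remainder])
    return sign + "".join(reversed(digits))


def from_radix(text: str, radix: int) -> int:
    """Digit string in the given radix -> base-10 integer."""
    return int(text, radix)


if __name__ == "__main__":
    print(from_radix("0F", 16))  # the direction the question needs
    print(to_radix(15, 16))      # the opposite direction
```

Only the second helper matches the requirement in this thread: the input is a hex string and the output is a base-10 integer.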