Nifi Extract Text - Match on text and return the characters that follow

JamesE — Thu, 16 Jan 2020 16:15:47 GMT

Very new to Nifi and regex.

I have a test txt file containing a mock log.
Format along the lines of:

srcip=10.10.10.10 timestamp=152532431 action="denied"

What I need is to match against the word and then return everything after the '=' until the next space character.

Any help would be appreciated.

Re: Nifi Extract Text - Match on text and return the characters that follow

MattWho — Thu, 16 Jan 2020 18:23:29 GMT

@JamesE

The ExtractText processor is used to extract text from the content of the FlowFile using a Java Regular Expression and insert that extracted text into FlowFile attributes.

So using your FlowFile content example here:

srcip=10.10.10.10 timestamp=152532431 action="denied"

What is your desired end result?
Three separate FlowFile attributes, one for each "word" (srcip, timestamp, and action)?

Assuming so, you would add three new properties to the ExtractText processor, one for each extracted value.

For each dynamic property added via the "+" icon, the property name becomes the FlowFile attribute name, and the string matched by the capture group in the Java regular expression becomes the value assigned to that new FlowFile attribute.
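
To make the name-to-attribute mapping concrete, here is a minimal sketch in Python that imitates the behaviour described above. It is not NiFi code: the `properties` dictionary, the `extract_attributes` helper, and the three patterns are illustrative assumptions, not the configuration from the original reply. Python's `re` module stands in for Java regular expressions here, and these particular patterns only use syntax common to both.

```python
import re

# FlowFile content from the example above.
content = 'srcip=10.10.10.10 timestamp=152532431 action="denied"'

# Hypothetical dynamic properties: property name -> regex with one capture group.
# \S+ captures everything after '=' up to the next whitespace character.
properties = {
    "srcip": r"srcip=(\S+)",
    "timestamp": r"timestamp=(\S+)",
    "action": r"action=(\S+)",
}

def extract_attributes(text, props):
    """Return {property name: capture group 1} for every pattern that matches."""
    attributes = {}
    for name, pattern in props.items():
        match = re.search(pattern, text)
        if match:
            attributes[name] = match.group(1)
    return attributes

attributes = extract_attributes(content, properties)
print(attributes)
```

Each key in the result plays the role of a FlowFile attribute name, and each value is whatever the capture group matched. Note that the quotes around "denied" are part of the captured value.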

Hope this helps,

Matt

Re: Nifi Extract Text - Match on text and return the characters that follow

JamesE — Thu, 16 Jan 2020 23:36:26 GMT

Hi Matt

Thank you for the reply.

That is what I am after. However, say those three attributes were in a long list of, say, 25 attributes and I only wanted certain ones.

Would I have to list all of them like you have, to get the desired flowfile attributes out?

For instance:

srcip=10.10.10.10 timestamp=152532431 action="denied" logver=12 tz="UTC+0" logid="0000012" dstip=12.12.12.12

Say I wanted srcip, action and dstip but none of the others. Would I need to list each attribute within each new property?

Say would this not work and why?

Re: Nifi Extract Text - Match on text and return the characters that follow

MattWho — Fri, 17 Jan 2020 22:05:57 GMT

@JamesE

You can handle this easily using a different set of Java Regular Expressions:

.*action=(.*?) .*
.*srcip=(.*?) .*
.*timestamp=(.*?) .*

If any one of these fields could be the very last field in the content line, then for this to work you would need to append a blank space to the end of the content using the ReplaceText processor before sending your FlowFile to the ExtractText processor. A blank space must follow each value so that the regex knows where the value ends for each field.
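
As a rough check of how this style of pattern behaves, the sketch below applies it to the longer sample line from the question, picking out only srcip, action, and dstip. It uses Python's `re` module as a stand-in for Java regular expressions (the syntax used is common to both), and the `field_value` helper is a hypothetical name introduced for illustration.

```python
import re

# The longer sample line from the question; dstip is the last field.
content = ('srcip=10.10.10.10 timestamp=152532431 action="denied" '
           'logver=12 tz="UTC+0" logid="0000012" dstip=12.12.12.12')

def field_value(name, text):
    """Apply the pattern .*<name>=(.*?) .* and return capture group 1, or None."""
    match = re.search(rf".*{name}=(.*?) .*", text)
    return match.group(1) if match else None

# Only the wanted fields need a pattern; the other fields are simply ignored.
wanted = {name: field_value(name, content) for name in ("srcip", "action", "dstip")}
print(wanted)
```

Here srcip and action are extracted, but dstip comes back as `None`: it is the last field, so no space follows its value and the pattern cannot match. That is the case the trailing space is meant to cover.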

Your ReplaceText processor configuration would look like this:

The "Replacement Value" is just a single space.
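
The effect of appending that single space can be sketched as follows. This is only an approximation in Python, assuming the append step amounts to adding one space to the end of the content; it does not reproduce the actual ReplaceText settings, and the variable names are illustrative.

```python
import re

# Content where the wanted field (action) is the last one on the line.
content = 'srcip=10.10.10.10 timestamp=152532431 action="denied"'
pattern = r".*action=(.*?) .*"

# Without a trailing space there is nothing after the value for the regex to stop on.
before = re.search(pattern, content)

# Assumed equivalent of the append step: add one space to the end of the content.
padded = content + " "
after = re.search(pattern, padded)

print(before)
print(after.group(1) if after else None)
```

The first search finds nothing, while the second captures the value once the space is in place.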

Hope this helps,

Matt