NIFI - Extract data from flowfile to write a Query

Hi, I have a flowfile with this format:

<td>Item1</td>
<td class="dest">50.3421</td>
<td class="dest">20.5547</td>

I need to build a query from these values, so I need to extract the numbers, for example with this regular expression:

(\d{2}.\d{4})

And put the two results into:

Insert into table(a,b) VALUES($1,$2)

How can I do this?

Thanks!

 

I tried ReplaceText, but it's not what I'm looking for.

1 ACCEPTED SOLUTION

@Alexandros 

 

You can accomplish this using ReplaceText with a more complex Java regular expression.

ReplaceText replaces every occurrence of the string matched by your Java regular expression with the replacement value, so you are probably seeing your replacement value inserted into your existing content twice.
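To illustrate the per-match behaviour described above, here is a minimal sketch in Python rather than NiFi itself. It assumes Python's `re` module treats this particular pattern the same way Java does, and `<REPLACEMENT>` is only a placeholder for whatever replacement value is configured.

```python
import re

# Sample flowfile content from the question.
content = (
    '<td>Item1</td>\n'
    '<td class="dest">50.3421</td>\n'
    '<td class="dest">20.5547</td>'
)

# The pattern from the question matches each number on its own,
# so there are two separate matches with one capture group each.
pattern = re.compile(r"(\d{2}.\d{4})")
matches = pattern.findall(content)

# A replace-every-match operation therefore inserts the replacement twice
# and leaves the surrounding markup in place.
replaced = pattern.sub("<REPLACEMENT>", content)
print(matches)
print(replaced)
```

Because each match holds only one number, a replacement that refers to a second group has nothing to fill it with.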

Try the following Java regular expression, which matches all three lines of your content:

.*\n.*?(\d{2}\.\d{4}).*?\n.*?(\d{2}\.\d{4}).*


Leave your replacement value as it is, and make sure Evaluation Mode is still set to Entire text.
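As a quick sanity check outside NiFi, the whole-content pattern can be tried in Python. This is only a sketch under a few assumptions: Python's `re` handles these constructs like Java's regex engine, the decimal point is escaped as `\.` (a small tightening so it matches only a literal dot), and Python writes the group references as `\1` and `\2` where ReplaceText uses `$1` and `$2`.

```python
import re

# Sample flowfile content from the question.
content = (
    '<td>Item1</td>\n'
    '<td class="dest">50.3421</td>\n'
    '<td class="dest">20.5547</td>'
)

# One match spanning all three lines, with both numbers captured.
pattern = re.compile(r".*\n.*?(\d{2}\.\d{4}).*?\n.*?(\d{2}\.\d{4}).*")

# A single match means a single replacement, so the whole content
# becomes the query text.
result = pattern.sub(r"Insert into table(a,b) VALUES(\1,\2)", content)
print(result)
```

Since the pattern consumes the entire content in one match, nothing of the original HTML is left behind.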

 

Hope this helps,

Matt


2 REPLIES 2

The first thing that comes to mind is the ExtractText processor. It lets you extract one or more parts of the text and store them in attributes.

 

org.apache.nifi.processors.standard.ExtractText
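The attribute-based approach can be sketched in Python as follows. This is not NiFi code: `extract_attributes` is a hypothetical helper, `coords` is a made-up property name, and the `name.index` attribute naming follows my understanding of how ExtractText labels capture groups, which should be checked against the processor documentation.

```python
import re

# Sample flowfile content from the question.
content = (
    '<td>Item1</td>\n'
    '<td class="dest">50.3421</td>\n'
    '<td class="dest">20.5547</td>'
)

def extract_attributes(text, name, regex):
    """Hypothetical stand-in for ExtractText: store each capture group
    of the first match under '<name>.<group index>'."""
    match = re.search(regex, text)
    if match is None:
        return {}
    return {f"{name}.{i}": value for i, value in enumerate(match.groups(), start=1)}

# One pattern with two groups; [\s\S]*? lazily skips everything
# (including newlines) between the two numbers.
attrs = extract_attributes(content, "coords", r"(\d{2}\.\d{4})[\s\S]*?(\d{2}\.\d{4})")

# A later step can then build the query from the stored values.
query = f"Insert into table(a,b) VALUES({attrs['coords.1']},{attrs['coords.2']})"
print(query)
```

The difference from the ReplaceText route is that the content stays untouched and the values travel as attributes until a later step builds the query.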


- Dennis Jaheruddin

