Created 05-10-2017 08:58 PM
I have two scenarios where I need help extracting values.
1) The flow file has the content below:
*********content of flow file***********
field_1=field1value&field_2=field2value&field_3=field3value&...&field_n=fieldnvalue
********End of Content***************
I want to extract the values of field_1, field_2, field_3 ... field_n and store them in 3 attributes. Can I get a regular expression example to do that using ExtractText in NiFi?
Or say I want to extract the value of field_x (1 < x < n) from the above list, how can I do that?
2) If I have an attribute with the value below:
**********attribute value***********
field_1=field1value&field_2=field2value&field_3=field3value&...&field_n=fieldnvalue
***********************************
I want to extract the value of field_x (1 < x < n) from the above list, how can I do that?
Created on 05-11-2017 01:25 PM - edited 08-17-2019 06:18 PM
You could use the ExtractText processor configured similarly to the following:
I changed the two shown standard properties and added two additional regex properties.
Using the following example input:
field_1=field1value&field_2=field2value&field_3=field3value&field_4=field4value
You will end up with the following attributes on your FlowFile:
A few additional attributes will be created that you can ignore, but you will have sequentially numbered attribute names with the associated values, plus one field_last attribute that holds the very last value in your input string.
Thanks,
Matt
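The property values referred to in this answer are not reproduced in the text, so the snippet below is only a rough sketch of the idea, not the exact configuration used. It shows, in Python, one regular expression that collects every value in order and another that captures the last value. ExtractText evaluates Java regular expressions; these two patterns use only syntax that is common to both engines. The numbered attribute names (field.1, field.2, ...) are hypothetical and chosen for illustration.

```python
import re

content = "field_1=field1value&field_2=field2value&field_3=field3value&field_4=field4value"

# Every value: the text after '=' up to the next '&' (or the end of input).
values = re.findall(r"=([^&]*)", content)

# Hypothetical attribute names, numbered in order of appearance.
attributes = {f"field.{i}": value for i, value in enumerate(values, start=1)}

# Last value: the greedy '.*' pushes the match to the final '=' in the string.
last = re.search(r".*=([^&]*)$", content)
attributes["field_last"] = last.group(1)

for name, value in attributes.items():
    print(name, "=", value)
```

Note that this simple approach assumes the values themselves never contain '&' or '='.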
Created 05-11-2017 05:31 AM
Hi @Anil Reddy
If you are like me, and dislike RegEx, one trick you can try is to use the SplitContent processor first. Change config dropdown to use Text instead of Hexadecimal, and use the byte sequence of your pair delimiter &. This would simplify the RegEx if you wanted to use ExtractText still. Or perhaps you can explore using another SplitContent processor on the = to get the field and value tokens separately. Hopefully you can avoid the RegEx there.
As always, if you find this post helpful, please accept the answer.
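To illustrate the split-first idea outside of NiFi, here is a minimal Python sketch that splits on the pair delimiter & and then on the first = in each pair, with no regular expressions. The function name split_pairs is hypothetical, and the code only mirrors the logic of chaining two splits; it is not a NiFi configuration.

```python
def split_pairs(content):
    """Split 'name=value&name=value' text into a dict without using regex."""
    result = {}
    for pair in content.split("&"):
        if not pair:
            continue  # skip empty pieces from leading or trailing '&'
        # partition splits on the first '=' only, so values may contain '='
        name, _, value = pair.partition("=")
        result[name] = value
    return result


sample = "field_1=field1value&field_2=field2value&field_3=field3value"
pairs = split_pairs(sample)
print(pairs["field_2"])
```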
Created 05-11-2017 05:57 PM
Thanks for the response. The approach mentioned above is not an ideal solution for my requirement.
Created 05-11-2017 05:55 PM
@Matt Clarke: Thanks for the solution; it works for the specific example of field_1, field_2.
My actual situation is that the flow file contains a number of fields separated by the character &, and I would like to extract specific fields by field name. I have been experimenting with regular expressions but have not succeeded so far. In fact, I am trying to understand the regular expressions you specified, but I have not been able to follow them.
Can you please let me know how to extract the fields from a flow file whose content is, say:
firstname=testfirstname&lastname=testlastname,email=testemail@email.com,address=test address.
I want to extract the attributes firstname, lastname, email and address with the appropriate values.
Created 05-11-2017 06:32 PM
Your new example does not use a "&" between all your fields. Is that a typo? I see "&" between first two fields and "," after that.
Created 05-11-2017 06:46 PM
Sorry, it's a typo.
firstname=testfirstname&lastname=testlastname&email=testemail@email.com&address=test address.
Created 05-11-2017 06:44 PM
Are the four fields you are trying to extract the values from consistently:
"firstname"
"lastname"
"email"
"address"
Created 05-11-2017 06:49 PM
Yes, I want to extract the values of firstname, lastname, email and address.
We can assume the following.
If the flow file's content is:
&firstname=testfirstname&lastname=testlastname&email=testemail@email.com&address=test address&
I want to extract XXXX in the regex &firstname=XXXX&.* as the value of the firstname field,
and extract XXXXX in the regex .*&lastname=XXXXX&.* as the value of lastname, and so on.
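One possible way to express "the value of a named field" as a regular expression is sketched below in Python. The helper name extract_field is hypothetical. The pattern anchors the field name to the start of the content or to a preceding &, then captures everything up to the next &. The same pattern text, for example (?:^|&)firstname=([^&]*), should in principle be usable as the value of a dynamic property in ExtractText, since it relies only on syntax shared with Java regular expressions, but that has not been verified against NiFi here.

```python
import re


def extract_field(content, name):
    """Return the value of the field `name`, or None if it is absent."""
    # (?:^|&) requires the name to start a pair, so 'name' does not match 'firstname'.
    # ([^&]*) captures the value up to the next '&' or the end of the content.
    pattern = r"(?:^|&)" + re.escape(name) + r"=([^&]*)"
    match = re.search(pattern, content)
    return match.group(1) if match else None


content = (
    "&firstname=testfirstname&lastname=testlastname"
    "&email=testemail@email.com&address=test address&"
)

for field in ("firstname", "lastname", "email", "address"):
    print(field, "->", extract_field(content, field))
```

Because the field name is a parameter, the same function also covers the earlier question of pulling a single field_x out of the list.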
Created 05-11-2017 06:50 PM
Just for tagging!