Parsing flow file or attribute in Nifi
Labels: Apache NiFi
Created ‎05-10-2017 08:58 PM
I have two scenarios where I need help extracting values.
1) The flow file has the content below:
*********content of flow file***********
field_1=field1value&field_2=field2value&field_3=field3value&...&field_n=fieldnvalue
********End of Content***************
I want to extract the values of field_1, field_2, field_3 ... field_n and store them as attributes. Can I get an example regular expression to do that using ExtractText in NiFi?
Or, say I want to extract only the value of field_x (1 < x < n) from the above list, how can I do that?
2) I have an attribute with the value below:
**********attribute value***********
field_1=field1value&field_2=field2value&field_3=field3value&...&field_n=fieldnvalue
***********************************
I want to extract the value of field_x (1 < x < n) from the above list. How can I do that?
Created on ‎05-11-2017 01:25 PM - edited ‎08-17-2019 06:18 PM
You could use the ExtractText processor configured similar to the following:
I changed the two standard properties shown and added two additional regex properties.
Using the following example input:
field_1=field1value&field_2=field2value&field_3=field3value&field_4=field4value
You will end up with the following attributes on your FlowFile:
There will be a few additional attributes created that you can ignore, but you will have sequentially numbered attribute names with the associated values, plus one field_last attribute that holds the very last value in your input string.
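The property values themselves are not given in the text, so the following Python sketch only illustrates the idea with assumed patterns: one regex whose capture group is applied repeatedly to produce numbered values, and a second regex that grabs only the final value. The attribute names `field.N` and `field_last` follow the description above; the two regexes are assumptions, not the ones from the original configuration. NiFi evaluates Java regular expressions, though patterns this simple behave the same way in Python.

```python
import re

content = "field_1=field1value&field_2=field2value&field_3=field3value&field_4=field4value"

# Assumed pattern: capture every value that follows "=" up to the next "&" or end of input.
value_pattern = re.compile(r"=([^&]*)")
# Assumed pattern: the greedy ".*" runs to the last "=", so only the final value is captured.
last_pattern = re.compile(r".*=([^&]*)$")

attributes = {}
for index, value in enumerate(value_pattern.findall(content), start=1):
    attributes[f"field.{index}"] = value

last_match = last_pattern.search(content)
if last_match:
    attributes["field_last"] = last_match.group(1)

for name, value in attributes.items():
    print(f"{name} = {value}")
```

With the example input, this produces `field.1` through `field.4` and a `field_last` equal to `field4value`.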
Thanks,
Matt
Created ‎05-11-2017 05:31 AM
Hi @Anil Reddy
If you are like me and dislike RegEx, one trick you can try is to use the SplitContent processor first. Set the Byte Sequence Format dropdown to Text instead of Hexadecimal, and use your pair delimiter & as the byte sequence. This would simplify the RegEx if you still wanted to use ExtractText. Or perhaps you can explore using another SplitContent processor on the = to get the field and value tokens separately. Hopefully you can avoid the RegEx that way.
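The two-stage split can be sketched outside NiFi as follows. This is plain Python that mimics splitting on & and then on =; it does not call any NiFi API, and the function name `split_pairs` is hypothetical.

```python
def split_pairs(content, pair_delim="&", kv_delim="="):
    """Split content into pairs, then split each pair into a field name and a value."""
    result = {}
    for pair in content.split(pair_delim):
        if not pair:
            continue  # skip empty tokens caused by leading or trailing delimiters
        name, sep, value = pair.partition(kv_delim)
        if sep:  # ignore tokens that contain no "=" at all
            result[name] = value
    return result


pairs = split_pairs("field_1=field1value&field_2=field2value&field_3=field3value")
print(pairs["field_2"])
```

Using `partition` rather than a second `split` keeps any further = characters inside the value intact.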
As always, if you find this post helpful, please accept the answer.
Created ‎05-11-2017 05:57 PM
Thanks for the response. The approach mentioned above is not an ideal solution for my requirement.
Created ‎05-11-2017 05:55 PM
@Matt Clarke: Thanks for the solution. It works for the specific example of field_1, field_2.
My actual intention is this: I have a couple of fields in the flow file content, separated by the & character, and I would like to extract a specific field based on its field name. I have been experimenting with regular expressions to achieve that but have not succeeded so far. In fact, I am trying to understand the regular expressions you specified but cannot quite follow them.
Can you please let me know how to extract the fields from a flow file whose content is, say:
firstname=testfirstname&lastname=testlastname,email=testemail@email.com,address=test address.
I want to extract the attributes firstname, lastname, email, and address with the appropriate values.
Created ‎05-11-2017 06:32 PM
Your new example does not use a "&" between all your fields. Is that a typo? I see "&" between first two fields and "," after that.
Created ‎05-11-2017 06:46 PM
Sorry, that was a typo. The content should be:
firstname=testfirstname&lastname=testlastname&email=testemail@email.com&address=test address.
Created ‎05-11-2017 06:44 PM
Are the four fields you are trying to extract the values from consistently:
"firstname"
"lastname"
"email"
"address"
Created ‎05-11-2017 06:49 PM
Yes, I want to extract the values of firstname, lastname, email, and address.
We can assume the following.
If the flow file's content is
&firstname=testfirstname&lastname=testlastname&email=testemail@email.com&address=test address&
I want to extract XXXX in the regex &firstname=XXXX&.* as the value for the firstname field,
and extract XXXX in the regex .*&lastname=XXXX&.* as the value for lastname, etc.
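One possible way to express this is a pattern that ties each field name to the start of the content or to a preceding & and captures everything up to the next &. The Python sketch below illustrates that idea with a hypothetical helper, `extract_field`. NiFi's ExtractText uses Java regular expressions, where the same pattern syntax is valid, but treat this as a starting point rather than a tested ExtractText configuration.

```python
import re


def extract_field(content, name):
    """Return the value of `name` in an &-delimited key=value string, or None."""
    # (?:^|&) makes sure the whole field name is matched, so "name" does not
    # accidentally match inside "firstname" or "lastname".
    pattern = re.compile(r"(?:^|&)" + re.escape(name) + r"=([^&]*)")
    match = pattern.search(content)
    return match.group(1) if match else None


content = "&firstname=testfirstname&lastname=testlastname&email=testemail@email.com&address=test address&"

for field in ("firstname", "lastname", "email", "address"):
    print(field, "->", extract_field(content, field))
```

In ExtractText this would presumably translate to one added property per field, for example a property named firstname with the value `(?:^|&)firstname=([^&]*)`, so that the first capture group becomes the attribute value. Because the pattern does not require a trailing &, it also works when the content has no leading or trailing delimiter.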
Created ‎05-11-2017 06:50 PM
Just for tagging!
