Created 09-17-2018 01:32 PM
Hi All,
I have a use case where I want to count the occurrences of a word and perform an action based on that count.
Example:
1. I have multiple flowfiles coming in
2. I want to extract a word (say, user_name) using the ExtractText processor
3. Count the word
4. If user_name_count = 10
5. Use ReplaceText to replace 10 with 1
6. Use PutEmail to notify user_name that the user_name count is 10.
Could you please let me know which processors would be helpful for this use case?
Suggestions are appreciated.
Created on 09-18-2018 03:09 PM - edited 08-18-2019 12:30 AM
I tried your case using UpdateAttribute's Store State feature.
Flow:
1. Two GenerateFlowFile processors // to get 2 flowfiles
2. SplitText // split the flowfile into one line per flowfile
3. ExtractText // extract the first value from the content
4. RouteOnAttribute // check the extracted value from the flowfile attribute
5. UpdateAttribute // add one to the seq attribute and reset the seq attribute value when it reaches 10 (advanced usage of the UpdateAttribute processor)
6. RouteOnAttribute // check the seq attribute value and route to PutEmail if seq = 10
7. PutEmail // send mail
I have attached the flow template; reuse it and change it as per your requirements.
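To make the counter logic in steps 5 and 6 easier to follow, here is a minimal Python sketch of the same idea outside NiFi. It is not the attached template: the `UserCounter` class, the per-user dictionary, and the choice to reset the count to 0 are assumptions made for illustration only.

```python
from collections import defaultdict

THRESHOLD = 10  # from the question: act when a user's count reaches 10


class UserCounter:
    """Hypothetical stand-in for the stateful 'seq' attribute, kept per user_name."""

    def __init__(self, threshold=THRESHOLD):
        self.threshold = threshold
        self.counts = defaultdict(int)

    def observe(self, user_name):
        """Record one flowfile for user_name; return True when an email is due."""
        self.counts[user_name] += 1
        if self.counts[user_name] >= self.threshold:
            # Assumption: reset to 0 so the next ten occurrences trigger again.
            self.counts[user_name] = 0
            return True
        return False


counter = UserCounter()
events = ["mark"] * 10 + ["stella"]
notified = [name for name in events if counter.observe(name)]
print(notified)  # ['mark']
```

In the flow, `observe` corresponds roughly to UpdateAttribute with stored state, and the `True` branch corresponds to the RouteOnAttribute relationship that feeds PutEmail.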
Created 09-17-2018 05:32 PM
We need more details to provide a correct solution for this case.
1. Could you please provide some sample data for this case?
2. Do you want to count user_name within a particular flowfile, i.e. if the flowfile content contains user_name 10 times, then send out an email?
(or)
Count 10 flowfiles that have user_name and send out mail once the count reaches 10?
3. Do you know the schema for the flowfile?
Created 09-18-2018 01:09 PM
1. Sample data:
Every value is present in attributes (i.e. every flowfile is parsed and the values in the flowfile are assigned to attributes).
There are multiple flowfiles with the same value (user_name) in attributes.
ex:
flowfile1 attributes:
user_name: mark, file_in: 2018-09-18 15:00:00, file_out: 2018-09-18 15:01:00
user_name: michelle, file_in: 2018-09-18 15:00:02, file_out: 2018-09-18 15:01:01
user_name: mark, file_in: 2018-09-18 15:00:05, file_out: 2018-09-18 15:01:01
flowfile2 attributes:
user_name: mark, file_in: 2018-09-18 15:01:00, file_out: 2018-09-18 15:01:10
user_name: stella, file_in: 2018-09-18 15:01:12, file_out: 2018-09-18 15:01:21
2. I want to count all the flowfiles that have a given user_name (in the above example, the count of mark is 3 across both flowfiles).
3. The schema of the flowfile is just the three fields above, which are assigned to attributes.
Thank you
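As a quick check of the expected counts for the sample data above, here is a small Python sketch that tallies `user_name` across both flowfiles. The list-of-dicts layout is an assumption for illustration; only the attribute names and values come from the post.

```python
from collections import Counter

# Attribute sets copied from the sample data in the post above.
flowfile1 = [
    {"user_name": "mark", "file_in": "2018-09-18 15:00:00", "file_out": "2018-09-18 15:01:00"},
    {"user_name": "michelle", "file_in": "2018-09-18 15:00:02", "file_out": "2018-09-18 15:01:01"},
    {"user_name": "mark", "file_in": "2018-09-18 15:00:05", "file_out": "2018-09-18 15:01:01"},
]
flowfile2 = [
    {"user_name": "mark", "file_in": "2018-09-18 15:01:00", "file_out": "2018-09-18 15:01:10"},
    {"user_name": "stella", "file_in": "2018-09-18 15:01:12", "file_out": "2018-09-18 15:01:21"},
]

# Tally user_name over every attribute set in both flowfiles.
counts = Counter(
    record["user_name"]
    for flowfile in (flowfile1, flowfile2)
    for record in flowfile
)
print(counts["mark"])  # 3
```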
Created 09-17-2018 10:35 PM
While I usually recommend using the existing processors to perform individual tasks and chaining them together to achieve your overall goal, I think this is a case where an ExecuteScript processor with a custom script could be best. As long as the input is not on the order of 10 MB+ per flowfile, you should be able to perform text searching and counting pretty well with a simple Ruby, Groovy, or Python script and provide the result in the output you want to route directly to the PutEmail processor.
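For reference, the counting part of such a script could look something like the sketch below. It is plain, standalone Python and is not wired to the ExecuteScript session API; the `count_occurrences` helper and the sample text are hypothetical.

```python
import re


def count_occurrences(text, word):
    """Count whole-word occurrences of `word` in `text`."""
    # re.escape guards against regex metacharacters in the search word;
    # \b keeps "mark" from matching inside a longer word such as "marked".
    pattern = re.compile(r"\b" + re.escape(word) + r"\b")
    return len(pattern.findall(text))


sample = "user_name: mark\nuser_name: michelle\nuser_name: mark\n"
print(count_occurrences(sample, "mark"))  # 2
```

The whole content is read into memory here, which is consistent with the caveat above about keeping flowfiles well below the 10 MB range.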
Otherwise, everything you want can be easily done with native processors except counting occurrences of a specific string, but you could use ExecuteStreamCommand with awk to achieve this. You'll just have to spend extra time converting the formats back and forth to be useful.
Created 09-18-2018 01:12 PM
That's a nice idea, but I don't have the option to use ExecuteScript or ExecuteStreamCommand: there are no scripts or programs (including awk) available to me, and getting them is out of my hands, so I'm looking for a solution within my reach.
Thank you
Created 10-02-2018 01:33 PM
Thank you