<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: How to count the number of occurrences of a word (similar to word count) and do an action on it in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/How-to-count-the-number-of-occurrences-of-a-word-similar-to/m-p/191049#M153138</link>
    <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/18929/yaswanthmuppireddy.html" nodeid="18929"&gt;@Shu&lt;/A&gt; &lt;/P&gt;&lt;P&gt;Thank you&lt;/P&gt;</description>
    <pubDate>Tue, 02 Oct 2018 20:33:43 GMT</pubDate>
    <dc:creator>mark_hadoop</dc:creator>
    <dc:date>2018-10-02T20:33:43Z</dc:date>
    <item>
      <title>How to count the number of occurrences of a word (similar to word count) and do an action on it</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-count-the-number-of-occurrences-of-a-word-similar-to/m-p/191043#M153132</link>
      <description>&lt;P&gt;Hi All,&lt;/P&gt;&lt;P&gt;I have a use case where I want to count the number of occurrences of a word and perform an action based on that count.&lt;/P&gt;&lt;P&gt;Example:&lt;/P&gt;&lt;P&gt;1. I have multiple flowfiles coming in&lt;/P&gt;&lt;P&gt;2. I want to extract a word (say, user_name) using the ExtractText processor&lt;/P&gt;&lt;P&gt;3. Count the occurrences of that word&lt;/P&gt;&lt;P&gt;4. If user_name_count = 10&lt;/P&gt;&lt;P&gt;5. Use ReplaceText to replace 10 with 1&lt;/P&gt;&lt;P&gt;6. Use PutEmail to notify user_name that the user_name count is 10.&lt;/P&gt;&lt;P&gt;Can you please let me know which processors would be helpful for this use case?&lt;/P&gt;&lt;P&gt;Suggestions are appreciated.&lt;/P&gt;</description>
      <pubDate>Mon, 17 Sep 2018 20:32:42 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-count-the-number-of-occurrences-of-a-word-similar-to/m-p/191043#M153132</guid>
      <dc:creator>mark_hadoop</dc:creator>
      <dc:date>2018-09-17T20:32:42Z</dc:date>
    </item>
    <item>
      <title>Re: How to count the number of occurrences of a word (similar to word count) and do an action on it</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-count-the-number-of-occurrences-of-a-word-similar-to/m-p/191044#M153133</link>
      <description>&lt;A rel="user" href="https://community.cloudera.com/users/23208/hadoopuserhadoop.html" nodeid="23208"&gt;@Mark&lt;/A&gt;&lt;P&gt;We need more details to provide a correct solution for this case.&lt;BR /&gt;1. Could you please provide some sample data?&lt;/P&gt;&lt;P&gt;2. Do you want to count user_name within a particular flowfile, i.e. if the flowfile content contains user_name 10 times, then send out an email?&lt;/P&gt;&lt;P&gt;(or)&lt;BR /&gt;Count the flowfiles that have user_name and send out an email once the count reaches 10?&lt;/P&gt;&lt;P&gt;3. Do you know the schema of the flowfile?&lt;/P&gt;</description>
      <pubDate>Tue, 18 Sep 2018 00:32:36 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-count-the-number-of-occurrences-of-a-word-similar-to/m-p/191044#M153133</guid>
      <dc:creator>Shu_ashu</dc:creator>
      <dc:date>2018-09-18T00:32:36Z</dc:date>
    </item>
    <item>
      <title>Re: How to count the number of occurrences of a word (similar to word count) and do an action on it</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-count-the-number-of-occurrences-of-a-word-similar-to/m-p/191045#M153134</link>
      <description>&lt;P&gt;
	While I usually recommend using the existing processors to perform individual tasks and chain them together to achieve your overall goal, I think this is a case where an &lt;CODE&gt;ExecuteScript&lt;/CODE&gt; processor with a custom script could be best. As long as the input is not on the order of 10 MB+ per flowfile, you should be able to perform text searching and counting pretty well with a simple Ruby, Groovy, or Python script and provide it in the output you want to route directly to the &lt;CODE&gt;PutEmail&lt;/CODE&gt; processor. &lt;/P&gt;&lt;P&gt;
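	A minimal sketch of the counting logic such a script could run, in plain Python. The &lt;CODE&gt;user_name: value&lt;/CODE&gt; field layout, the function names, and the threshold default are assumptions for illustration, and the NiFi session handling that &lt;CODE&gt;ExecuteScript&lt;/CODE&gt; would need around it is left out. &lt;/P&gt;&lt;PRE&gt;
```python
import re
from collections import Counter

# Assumed field layout: "user_name: mark" somewhere in each record.
# Adjust the pattern to match the real flowfile content.
USER_NAME_FIELD = re.compile(r"user_name:\s*(\w+)")


def count_user_names(text):
    """Return a Counter mapping each user_name value to its number of occurrences."""
    return Counter(USER_NAME_FIELD.findall(text))


def names_at_threshold(counts, threshold=10):
    """Return the sorted names whose count equals the threshold."""
    return sorted(name for name, n in counts.items() if n == threshold)


if __name__ == "__main__":
    # Hypothetical content; a real script would read this from the flowfile.
    sample = "user_name: mark\nuser_name: michelle\nuser_name: mark\n"
    counts = count_user_names(sample)
    # A threshold of 2 is used here only so the small sample triggers a match.
    for name in names_at_threshold(counts, threshold=2):
        print(name, counts[name])
```
&lt;/PRE&gt;&lt;P&gt;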
	Otherwise, everything you want can be done easily with native processors except counting occurrences of a specific string, but you could use &lt;CODE&gt;ExecuteStreamCommand&lt;/CODE&gt; with &lt;CODE&gt;awk&lt;/CODE&gt; to achieve this. You'll just have to spend extra time converting the data between formats so the result is usable. &lt;/P&gt;</description>
      <pubDate>Tue, 18 Sep 2018 05:35:18 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-count-the-number-of-occurrences-of-a-word-similar-to/m-p/191045#M153134</guid>
      <dc:creator>alopresto</dc:creator>
      <dc:date>2018-09-18T05:35:18Z</dc:date>
    </item>
    <item>
      <title>Re: How to count the number of occurrences of a word (similar to word count) and do an action on it</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-count-the-number-of-occurrences-of-a-word-similar-to/m-p/191046#M153135</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/18929/yaswanthmuppireddy.html" nodeid="18929"&gt;@Shu&lt;/A&gt; &lt;/P&gt;&lt;P&gt;1. Sample data:&lt;/P&gt;&lt;P style="margin-left: 20px;"&gt;Every value is present in attributes (i.e., every flowfile is parsed and the values in the flowfile are assigned to attributes).&lt;/P&gt;&lt;P style="margin-left: 20px;"&gt;There are multiple flowfiles with the same value (user_name) in attributes.&lt;/P&gt;&lt;P style="margin-left: 20px;"&gt;ex:&lt;/P&gt;&lt;PRE&gt;flowfile1 attributes:&lt;BR /&gt;user_name: mark, file_in: 2018-09-18 15:00:00, file_out: 2018-09-18 15:01:00
user_name: michelle, file_in: 2018-09-18 15:00:02, file_out: 2018-09-18 15:01:01
user_name: mark, file_in: 2018-09-18 15:00:05, file_out: 2018-09-18 15:01:01

flowfile2 attributes:
user_name: mark, file_in: 2018-09-18 15:01:00, file_out: 2018-09-18 15:01:10
user_name: stella, file_in: 2018-09-18 15:01:12, file_out: 2018-09-18 15:01:21


&lt;BR /&gt;&lt;/PRE&gt;&lt;P&gt;2. I want to count all the flowfiles that have a given user_name (in the above example, the count of mark is 3 across both flowfiles).&lt;/P&gt;&lt;P&gt;3. The schema of the flowfile is just the 3 fields shown above, which are assigned to attributes.&lt;/P&gt;&lt;P&gt;Thank you &lt;/P&gt;</description>
      <pubDate>Tue, 18 Sep 2018 20:09:24 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-count-the-number-of-occurrences-of-a-word-similar-to/m-p/191046#M153135</guid>
      <dc:creator>mark_hadoop</dc:creator>
      <dc:date>2018-09-18T20:09:24Z</dc:date>
    </item>
    <item>
      <title>Re: How to count the number of occurrences of a word (similar to word count) and do an action on it</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-count-the-number-of-occurrences-of-a-word-similar-to/m-p/191047#M153136</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/595/alopresto.html" nodeid="595"&gt;@Andy LoPresto&lt;/A&gt; &lt;/P&gt;&lt;P&gt;That's a nice idea, but I don't have the leeway to use ExecuteScript or ExecuteStreamCommand, as there are no scripts/programs (including awk) available to me, and getting them is out of my hands, so I'm looking for a solution within my means.&lt;/P&gt;&lt;P&gt;Thank you &lt;/P&gt;</description>
      <pubDate>Tue, 18 Sep 2018 20:12:36 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-count-the-number-of-occurrences-of-a-word-similar-to/m-p/191047#M153136</guid>
      <dc:creator>mark_hadoop</dc:creator>
      <dc:date>2018-09-18T20:12:36Z</dc:date>
    </item>
    <item>
      <title>Re: How to count the number of occurrences of a word (similar to word count) and do an action on it</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-count-the-number-of-occurrences-of-a-word-similar-to/m-p/191048#M153137</link>
      <description>&lt;A rel="user" href="https://community.cloudera.com/users/23208/hadoopuserhadoop.html" nodeid="23208" target="_blank"&gt;@Mark&lt;/A&gt;&lt;P&gt;I tried your case using &lt;STRONG&gt;UpdateAttribute's Store State&lt;/STRONG&gt; feature.&lt;BR /&gt;&lt;STRONG&gt;flow:&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="91491-flow.png" style="width: 1451px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/18453i09DF0E6CD197AAFA/image-size/medium?v=v2&amp;amp;px=400" role="button" title="91491-flow.png" alt="91491-flow.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;PRE&gt;1. Two GenerateFlowFile processors //to get 2 flowfiles&lt;BR /&gt;2. SplitText //split each flowfile into single lines&lt;BR /&gt;3. ExtractText //extract the first value from the content&lt;BR /&gt;4. RouteOnAttribute //check the extracted value in the flowfile attribute&lt;BR /&gt;5. UpdateAttribute //add one to the seq attribute and reset the seq attribute value when it reaches 10 (advanced usage of the UpdateAttribute processor)&lt;BR /&gt;6. RouteOnAttribute //check the seq attribute value and send to PutEmail if seq = 10&lt;BR /&gt;7. PutEmail //send mail&lt;/PRE&gt;&lt;P&gt;I have attached the flow template; reuse it and change it as per your requirements.&lt;/P&gt;&lt;P&gt;&lt;A href="https://community.cloudera.com/legacyfs/online/attachments/91492-222030-support-update-reset.xml" target="_blank"&gt;222030-support-update-reset.xml&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Sun, 18 Aug 2019 07:30:42 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-count-the-number-of-occurrences-of-a-word-similar-to/m-p/191048#M153137</guid>
      <dc:creator>Shu_ashu</dc:creator>
      <dc:date>2019-08-18T07:30:42Z</dc:date>
    </item>
    <item>
      <title>Re: How to count the number of occurrences of a word (similar to word count) and do an action on it</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-count-the-number-of-occurrences-of-a-word-similar-to/m-p/191049#M153138</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/18929/yaswanthmuppireddy.html" nodeid="18929"&gt;@Shu&lt;/A&gt; &lt;/P&gt;&lt;P&gt;Thank you&lt;/P&gt;</description>
      <pubDate>Tue, 02 Oct 2018 20:33:43 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-count-the-number-of-occurrences-of-a-word-similar-to/m-p/191049#M153138</guid>
      <dc:creator>mark_hadoop</dc:creator>
      <dc:date>2018-10-02T20:33:43Z</dc:date>
    </item>
  </channel>
</rss>

