<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Count number of incoming flowfiles in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/Count-number-of-incoming-flowfiles/m-p/336239#M232205</link>
    <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/95033"&gt;@OliverGong&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks a lot for your helpful answer.&lt;/P&gt;&lt;P&gt;It increments the fragment.index attribute until it reaches the batchSize variable value.&lt;/P&gt;&lt;P&gt;It works when I know how many files I want to merge, so I can set that value in the batchSize variable.&lt;/P&gt;&lt;P&gt;But when I don't know how many files to merge (they come from business users), fragment.count is not set correctly.&lt;/P&gt;&lt;P&gt;Is there a way to get the number of incoming files dynamically?&lt;/P&gt;</description>
    <pubDate>Mon, 14 Feb 2022 13:49:15 GMT</pubDate>
    <dc:creator>yamaga</dc:creator>
    <dc:date>2022-02-14T13:49:15Z</dc:date>
    <item>
      <title>Count number of incoming flowfiles</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Count-number-of-incoming-flowfiles/m-p/336203#M232190</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;Is there a way to count the incoming files, assign that count to an attribute, and give each file a sequence number?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have files to merge using MergeContent, so I need to set fragment.index on each file and fragment.count to the total number of files to merge.&lt;/P&gt;</description>
      <pubDate>Mon, 14 Feb 2022 09:04:44 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Count-number-of-incoming-flowfiles/m-p/336203#M232190</guid>
      <dc:creator>yamaga</dc:creator>
      <dc:date>2022-02-14T09:04:44Z</dc:date>
    </item>
    <item>
      <title>Re: Count number of incoming flowfiles</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Count-number-of-incoming-flowfiles/m-p/336224#M232202</link>
      <description>&lt;P&gt;Thank you for your question.&lt;BR /&gt;&lt;BR /&gt;You may try using the &lt;STRONG&gt;UpdateAttribute&lt;/STRONG&gt; processor's &lt;STRONG&gt;stateful values&lt;/STRONG&gt; to handle the incoming flowfiles in batches.&lt;BR /&gt;&lt;BR /&gt;============================&lt;/P&gt;&lt;P&gt;Here are the settings for UpdateAttribute&lt;BR /&gt;============================&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="OliverGong_0-1644836598754.png" style="width: 400px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/33549i0A10A6A7AC49CE35/image-size/medium?v=v2&amp;amp;px=400" role="button" title="OliverGong_0-1644836598754.png" alt="OliverGong_0-1644836598754.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;Under the &lt;STRONG&gt;Advanced&lt;/STRONG&gt; mode of the &lt;STRONG&gt;UpdateAttribute&lt;/STRONG&gt; processor:&lt;/P&gt;&lt;OL&gt;&lt;LI&gt;Define two rules as below:&lt;UL&gt;&lt;LI&gt;R0 -&amp;gt; initializeBatchIndex&lt;UL&gt;&lt;LI&gt;&lt;STRONG&gt;Conditions&lt;/STRONG&gt;:&lt;UL&gt;&lt;LI&gt;&lt;STRONG&gt;${getStateValue("fragment.index"):equals(-1):or(${getStateValue('fragment.index'):plus(1):ge(${batchSize})})}&lt;/STRONG&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;&lt;SPAN&gt;Actions&amp;nbsp;&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;EM&gt;(add fragment-related attributes)&lt;/EM&gt;:&lt;UL&gt;&lt;LI&gt;&lt;STRONG&gt;&lt;SPAN&gt;fragment.count&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;UL&gt;&lt;LI&gt;${batchSize}&lt;/LI&gt;&lt;/UL&gt;&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;&lt;SPAN&gt;fragment.identifier &lt;/SPAN&gt;&lt;/STRONG&gt;(For each batch, it should generate a new UUID as the 
identifier)&lt;UL&gt;&lt;LI&gt;${UUID()}&lt;/LI&gt;&lt;/UL&gt;&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;&lt;SPAN&gt;fragment.index&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;UL&gt;&lt;LI&gt;&lt;STRONG&gt;&lt;SPAN&gt;${getStateValue('fragment.index'):plus(1):mod(${batchSize})}&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;/LI&gt;&lt;LI&gt;R1 -&amp;gt;&amp;nbsp;Iterations&lt;UL&gt;&lt;LI&gt;&lt;STRONG&gt;Conditions:&lt;/STRONG&gt;&lt;UL&gt;&lt;LI&gt;&lt;STRONG&gt;${getStateValue("fragment.index"):equals(-1):or(${getStateValue('fragment.index'):plus(1):ge(${batchSize})}):not()}&lt;/STRONG&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;&lt;SPAN&gt;Actions&amp;nbsp;&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;EM&gt;(add fragment-related attributes)&lt;/EM&gt;:&lt;UL&gt;&lt;LI&gt;&lt;STRONG&gt;fragment.count&lt;/STRONG&gt; &lt;I&gt;&lt;EM&gt;(This attribute may be optional, as its value is always the same within a given batch)&lt;/EM&gt;&lt;/I&gt;&lt;UL&gt;&lt;LI&gt;&lt;STRONG&gt;&lt;SPAN&gt;${getStateValue('fragment.count')}&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;&lt;SPAN&gt;fragment.identifier&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;UL&gt;&lt;LI&gt;&lt;STRONG&gt;&lt;SPAN&gt;${getStateValue('fragment.identifier')}&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;&lt;SPAN&gt;fragment.index&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;UL&gt;&lt;LI&gt;&lt;STRONG&gt;&lt;SPAN&gt;${getStateValue('fragment.index'):plus(1):mod(${batchSize})}&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;/LI&gt;&lt;/OL&gt;&lt;P&gt;NOTE:&lt;BR /&gt;Before that, set a variable in your current &lt;STRONG&gt;Process Group&lt;/STRONG&gt; (right-click an empty area inside your process group, select Variables, and add a variable named &lt;STRONG&gt;batchSize, with the merge count you want to 
set&lt;/STRONG&gt;)&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="OliverGong_1-1644837695656.png" style="width: 400px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/33550iE3E240A9641AB0D9/image-size/medium?v=v2&amp;amp;px=400" role="button" title="OliverGong_1-1644837695656.png" alt="OliverGong_1-1644837695656.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="OliverGong_2-1644837915558.png" style="width: 400px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/33551iA0B7D536F51D30E8/image-size/medium?v=v2&amp;amp;px=400" role="button" title="OliverGong_2-1644837915558.png" alt="OliverGong_2-1644837915558.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;MergeContent then merges the flowfiles that share the same fragment.identifier.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="OliverGong_4-1644838118741.png" style="width: 400px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/33553iEE4E56768FEEA671/image-size/medium?v=v2&amp;amp;px=400" role="button" title="OliverGong_4-1644838118741.png" alt="OliverGong_4-1644838118741.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="OliverGong_5-1644838635937.png" style="width: 400px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/33554i2471DE571C2AF089/image-size/medium?v=v2&amp;amp;px=400" role="button" title="OliverGong_5-1644838635937.png" alt="OliverGong_5-1644838635937.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Please let me know if this helps.&lt;BR /&gt;&lt;BR /&gt;Thanks &amp;amp; Regards,&lt;BR /&gt;Oliver Gong&lt;/P&gt;</description>
      <pubDate>Mon, 14 Feb 2022 11:40:32 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Count-number-of-incoming-flowfiles/m-p/336224#M232202</guid>
      <dc:creator>OliverGong</dc:creator>
      <dc:date>2022-02-14T11:40:32Z</dc:date>
    </item>
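    <!--
Editorial sketch (not part of the original post): the two UpdateAttribute rules in the answer above can be hard to follow as nested Expression Language, so here is a small Python simulation of the same logic. It is not NiFi code. The names FragmentState, tag_flowfile and tag_all are hypothetical, and the initial state value of -1 is an assumption inferred from the equals(-1) check in the rule conditions.

```python
import uuid
from dataclasses import dataclass


@dataclass
class FragmentState:
    # Stands in for the processor's stored state; -1 is the assumed initial index.
    index: int = -1
    count: int = 0
    identifier: str = ""


def tag_flowfile(state: FragmentState, batch_size: int) -> dict:
    """Update the state for one incoming flowfile and return its fragment.* attributes."""
    # Condition of rule R0: nothing processed yet, or the previous batch is full.
    starts_new_batch = state.index == -1 or state.index + 1 >= batch_size
    if starts_new_batch:
        # R0 actions: fix the batch size and generate a new identifier.
        state.count = batch_size
        state.identifier = str(uuid.uuid4())
    # R1 otherwise reuses the stored count and identifier unchanged.
    # Both rules compute the index as (previous index + 1) mod batchSize.
    state.index = (state.index + 1) % batch_size
    return {
        "fragment.index": state.index,
        "fragment.count": state.count,
        "fragment.identifier": state.identifier,
    }


def tag_all(n_files: int, batch_size: int) -> list:
    """Tag n_files flowfiles in arrival order, starting from a fresh state."""
    state = FragmentState()
    return [tag_flowfile(state, batch_size) for _ in range(n_files)]
```

With 7 files and a batch size of 3, the indexes cycle 0, 1, 2, 0, 1, 2, 0 and a new identifier starts at every index 0. The seventh file opens a batch that claims 3 fragments but only has 1, which is the situation raised in the follow-up question when the total number of files is not known in advance.
    -->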
    <item>
      <title>Re: Count number of incoming flowfiles</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Count-number-of-incoming-flowfiles/m-p/336239#M232205</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/95033"&gt;@OliverGong&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks a lot for your helpful answer.&lt;/P&gt;&lt;P&gt;It increments the fragment.index attribute until it reaches the batchSize variable value.&lt;/P&gt;&lt;P&gt;It works when I know how many files I want to merge, so I can set that value in the batchSize variable.&lt;/P&gt;&lt;P&gt;But when I don't know how many files to merge (they come from business users), fragment.count is not set correctly.&lt;/P&gt;&lt;P&gt;Is there a way to get the number of incoming files dynamically?&lt;/P&gt;</description>
      <pubDate>Mon, 14 Feb 2022 13:49:15 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Count-number-of-incoming-flowfiles/m-p/336239#M232205</guid>
      <dc:creator>yamaga</dc:creator>
      <dc:date>2022-02-14T13:49:15Z</dc:date>
    </item>
    <item>
      <title>Re: Count number of incoming flowfiles</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Count-number-of-incoming-flowfiles/m-p/336256#M232206</link>
      <description>&lt;P&gt;I used &lt;STRONG&gt;GetHDFSFileInfo&lt;/STRONG&gt; to get the number of incoming files from the&amp;nbsp;&lt;STRONG&gt;hdfs.count.files&lt;/STRONG&gt; attribute.&lt;/P&gt;&lt;P&gt;Then, at the end of the dataflow, I move the processed files into a separate folder so that only the files still to be merged stay in the root folder.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks to&amp;nbsp;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/95033"&gt;@OliverGong&lt;/a&gt;&amp;nbsp;for the hint &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 14 Feb 2022 18:08:42 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Count-number-of-incoming-flowfiles/m-p/336256#M232206</guid>
      <dc:creator>yamaga</dc:creator>
      <dc:date>2022-02-14T18:08:42Z</dc:date>
    </item>
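    <!--
Editorial sketch (not part of the original post): the accepted approach reads the file count from a directory listing instead of a fixed batchSize. The Python below illustrates that idea only. It is not NiFi code, the function name tag_known_batch is hypothetical, and len(filenames) merely stands in for the hdfs.count.files value that the post says GetHDFSFileInfo provides.

```python
import uuid


def tag_known_batch(filenames: list) -> list:
    """Assign fragment.* attributes when the full list of files is known up front."""
    total = len(filenames)  # plays the role of hdfs.count.files
    batch_id = str(uuid.uuid4())  # one identifier shared by the whole batch
    return [
        {
            "filename": name,
            "fragment.index": position,
            "fragment.count": total,
            "fragment.identifier": batch_id,
        }
        for position, name in enumerate(filenames)
    ]
```

Because the count is taken from the listing, every file in the folder carries the same fragment.count. That only stays accurate if the folder holds nothing but pending files, which is presumably why the post moves processed files elsewhere at the end of the flow.
    -->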
  </channel>
</rss>

