<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Nifi - MergeContent - Multiple CSV files - counter in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/Nifi-MergeContent-Multiple-CSV-files-counter/m-p/336161#M232162</link>
    <description>&lt;P&gt;Are all the files similar and you assign the fragment indexes in a round robin fashion? (1,2,3,4,5,6,1,2,3,...)&lt;/P&gt;&lt;P&gt;Or do the different index numbers identify different types of files?&lt;/P&gt;&lt;P&gt;When you merge, can you merge as many files as possible or do they always need to be merged 6 by 6?&lt;/P&gt;&lt;P&gt;Can you give an example of how you are going to use the index in the QueryRecord processor?&lt;/P&gt;</description>
    <pubDate>Sun, 13 Feb 2022 01:30:46 GMT</pubDate>
    <dc:creator>araujo</dc:creator>
    <dc:date>2022-02-13T01:30:46Z</dc:date>
    <item>
      <title>Nifi - MergeContent - Multiple CSV files - counter</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Nifi-MergeContent-Multiple-CSV-files-counter/m-p/336116#M232140</link>
      <description>&lt;P&gt;Hi&lt;/P&gt;&lt;P&gt;I want to merge 6 CSV files into 1.&lt;/P&gt;&lt;P&gt;I use:&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;ListHDFS &amp;gt;&amp;gt; FetchHDFS &amp;gt;&amp;gt; UpdateAttribute&lt;/STRONG&gt; &amp;gt;&amp;gt; &lt;STRONG&gt;&lt;FONT color="#008000"&gt;MergeContent &amp;gt;&amp;gt; QueryRecord &amp;gt;&amp;gt;&lt;/FONT&gt;&lt;/STRONG&gt; ...&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;ListHDFS &amp;gt;&amp;gt; FetchHDFS &amp;gt;&amp;gt; UpdateAttribute&lt;/STRONG&gt; is repeated once per file to merge (6 times)&lt;/P&gt;&lt;P&gt;because I need to give each file a &lt;EM&gt;fragment.index&lt;/EM&gt; attribute and an alias (used later for the join query in QueryRecord).&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="yamaga_0-1644591989998.png" style="width: 400px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/33534i1F83A884B0DFF321/image-size/medium?v=v2&amp;amp;px=400" role="button" title="yamaga_0-1644591989998.png" alt="yamaga_0-1644591989998.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The UpdateAttribute for one of the files:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="yamaga_1-1644592040948.png" style="width: 400px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/33535iDCBC0DB7FC06D356/image-size/medium?v=v2&amp;amp;px=400" role="button" title="yamaga_1-1644592040948.png" alt="yamaga_1-1644592040948.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Is there a way to avoid using multiple&amp;nbsp;&lt;STRONG&gt;ListHDFS &amp;gt;&amp;gt; FetchHDFS &amp;gt;&amp;gt; UpdateAttribute&lt;/STRONG&gt;&amp;nbsp;chains to get the files?&lt;/P&gt;&lt;P&gt;How can I reduce this to one&amp;nbsp;&lt;STRONG&gt;ListHDFS &amp;gt;&amp;gt; FetchHDFS &amp;gt;&amp;gt; UpdateAttribute&lt;/STRONG&gt; and give each file a different &lt;EM&gt;&lt;STRONG&gt;fragment.index&lt;/STRONG&gt;&lt;/EM&gt;, which should be between 0 and 6 (max number of files)?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I tried nextInt() to assign a new fragment.index value, but it is incremental, so it is not suitable for multiple executions.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks in advance.&lt;/P&gt;</description>
      <pubDate>Fri, 11 Feb 2022 15:15:45 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Nifi-MergeContent-Multiple-CSV-files-counter/m-p/336116#M232140</guid>
      <dc:creator>yamaga</dc:creator>
      <dc:date>2022-02-11T15:15:45Z</dc:date>
    </item>
    <item>
      <title>Re: Nifi - MergeContent - Multiple CSV files - counter</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Nifi-MergeContent-Multiple-CSV-files-counter/m-p/336161#M232162</link>
      <description>&lt;P&gt;Are all the files similar and you assign the fragment indexes in a round robin fashion? (1,2,3,4,5,6,1,2,3,...)&lt;/P&gt;&lt;P&gt;Or do the different index numbers identify different types of files?&lt;/P&gt;&lt;P&gt;When you merge, can you merge as many files as possible or do they always need to be merged 6 by 6?&lt;/P&gt;&lt;P&gt;Can you give an example of how you are going to use the index in the QueryRecord processor?&lt;/P&gt;</description>
      <pubDate>Sun, 13 Feb 2022 01:30:46 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Nifi-MergeContent-Multiple-CSV-files-counter/m-p/336161#M232162</guid>
      <dc:creator>araujo</dc:creator>
      <dc:date>2022-02-13T01:30:46Z</dc:date>
    </item>
    <item>
      <title>Re: Nifi - MergeContent - Multiple CSV files - counter</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Nifi-MergeContent-Multiple-CSV-files-counter/m-p/336162#M232163</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/11191"&gt;@araujo&lt;/a&gt;&amp;nbsp;thanks for your reply&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;This is an example:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have six CSV files:&lt;/P&gt;&lt;P&gt;file1.csv,&amp;nbsp;file2.csv,&amp;nbsp;file3.csv,&amp;nbsp;file4.csv have the same structure&lt;/P&gt;&lt;P&gt;file5.csv,&amp;nbsp;file6.csv have a different structure, but they have some common columns that I will use in QueryRecord&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;In order to use MergeContent, I need to give a different fragment.index attribute to each file; it should be &lt;STRONG&gt;between 0 and 5&lt;/STRONG&gt; (as I have 6 files).&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Before MergeContent, I use ListHDFS &amp;gt;&amp;gt; FetchHDFS &amp;gt;&amp;gt; UpdateAttribute &lt;STRONG&gt;&lt;FONT color="#FF0000"&gt;6 times&lt;/FONT&gt;&lt;/STRONG&gt; (once for each file), which is not a good design as I may have more than 6 files in the future.&amp;nbsp;UpdateAttribute is where I assign the fragment.index attribute for each file.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;My question is: is there a way to have ONE&amp;nbsp;ListHDFS &amp;gt;&amp;gt; FetchHDFS &amp;gt;&amp;gt; UpdateAttribute that gets all the files and assigns a different fragment.index to each file (between 0 and 5) in one&amp;nbsp;UpdateAttribute processor?&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;For your question about QueryRecord:&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;I give a "metric" parameter to the first 4 files and another to the other two in the&amp;nbsp;UpdateAttribute processor,&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;then in QueryRecord I use this kind of query:&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="python"&gt;select file1.col1, file1.col2,
file2.col3, file2.col4, file3.col5, file3.col6
 from (
   select ID, col1, col2 from FLOWFILE where m = 'a'
 ) file1
 left join (
   select ID, col3, col4 from FLOWFILE where m = 'b'
 ) file2 on file1.ID = file2.ID
 left join (
   select ID, col5, col6 from FLOWFILE where m = 'c'
 ) file3 on file1.ID = file3.ID&lt;/LI-CODE&gt;</description>
      <pubDate>Sun, 13 Feb 2022 02:36:36 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Nifi-MergeContent-Multiple-CSV-files-counter/m-p/336162#M232163</guid>
      <dc:creator>yamaga</dc:creator>
      <dc:date>2022-02-13T02:36:36Z</dc:date>
    </item>
    <item>
      <title>Re: Nifi - MergeContent - Multiple CSV files - counter</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Nifi-MergeContent-Multiple-CSV-files-counter/m-p/336164#M232165</link>
      <description>&lt;P&gt;How do you differentiate the files in HDFS? Are they in different directories? Do they have different filenames?&lt;/P&gt;</description>
      <pubDate>Sun, 13 Feb 2022 03:40:50 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Nifi-MergeContent-Multiple-CSV-files-counter/m-p/336164#M232165</guid>
      <dc:creator>araujo</dc:creator>
      <dc:date>2022-02-13T03:40:50Z</dc:date>
    </item>
    <item>
      <title>Re: Nifi - MergeContent - Multiple CSV files - counter</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Nifi-MergeContent-Multiple-CSV-files-counter/m-p/336165#M232166</link>
      <description>&lt;P&gt;If the different types of files are in different directories in HDFS, for example, you can use Expression Language to set the values for &lt;FONT face="courier new,courier"&gt;fragment.index&lt;/FONT&gt; and &lt;FONT face="courier new,courier"&gt;metric&lt;/FONT&gt;, using a single ListHDFS -&amp;gt; FetchHDFS -&amp;gt; UpdateAttribute.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The expression below sets the value for metric according to the path where the file came from:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;FONT face="courier new,courier"&gt;${path:equals("/tmp/input/dir1"):ifElse("a",&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="courier new,courier"&gt;${path:equals("/tmp/input/dir2"):ifElse("b",&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="courier new,courier"&gt;${path:equals("/tmp/input/dir3"):ifElse("c",&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="courier new,courier"&gt;${path:equals("/tmp/input/dir4"):ifElse("d",&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="courier new,courier"&gt;${path:equals("/tmp/input/dir5"):ifElse("e",&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="courier new,courier"&gt;${path:equals("/tmp/input/dir6"):ifElse("f",&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="courier new,courier"&gt;"other")})})})})})}&lt;/FONT&gt;&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;You can do the same for &lt;FONT face="courier new,courier"&gt;fragment.index&lt;/FONT&gt;.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sun, 13 Feb 2022 04:22:15 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Nifi-MergeContent-Multiple-CSV-files-counter/m-p/336165#M232166</guid>
      <dc:creator>araujo</dc:creator>
      <dc:date>2022-02-13T04:22:15Z</dc:date>
    </item>
    <item>
      <title>Re: Nifi - MergeContent - Multiple CSV files - counter</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Nifi-MergeContent-Multiple-CSV-files-counter/m-p/336175#M232174</link>
      <description>&lt;P&gt;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/95752"&gt;@yamaga&lt;/a&gt;&amp;nbsp;, does the above help?&lt;/P&gt;</description>
      <pubDate>Sun, 13 Feb 2022 22:07:43 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Nifi-MergeContent-Multiple-CSV-files-counter/m-p/336175#M232174</guid>
      <dc:creator>araujo</dc:creator>
      <dc:date>2022-02-13T22:07:43Z</dc:date>
    </item>
    <item>
      <title>Re: Nifi - MergeContent - Multiple CSV files - counter</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Nifi-MergeContent-Multiple-CSV-files-counter/m-p/336207#M232191</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/11191"&gt;@araujo&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks a lot for your help.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;That helped me assign the metric attribute.&lt;/P&gt;&lt;P&gt;But it does not work for the fragment.index attribute, because I might have more than one file coming from the same directory, so I need to assign a different fragment.index to each one.&lt;/P&gt;&lt;P&gt;I also need to count the number of incoming files in order to assign the fragment.count attribute.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 14 Feb 2022 10:43:23 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Nifi-MergeContent-Multiple-CSV-files-counter/m-p/336207#M232191</guid>
      <dc:creator>yamaga</dc:creator>
      <dc:date>2022-02-14T10:43:23Z</dc:date>
    </item>
    <item>
      <title>Re: Nifi - MergeContent - Multiple CSV files - counter</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Nifi-MergeContent-Multiple-CSV-files-counter/m-p/336208#M232192</link>
      <description>&lt;P&gt;I thought about using a Groovy script, but I could not find how to loop over each flowfile or how to get the count of the files.&lt;/P&gt;</description>
      <pubDate>Mon, 14 Feb 2022 10:45:12 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Nifi-MergeContent-Multiple-CSV-files-counter/m-p/336208#M232192</guid>
      <dc:creator>yamaga</dc:creator>
      <dc:date>2022-02-14T10:45:12Z</dc:date>
    </item>
  </channel>
</rss>

