<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question NiFi MergeContent generating 2 output files instead of 1 file in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/NiFi-MergeContent-generating-2-output-files-instead-of-1/m-p/303496#M221577</link>
    <description>&lt;P&gt;I'm using:&lt;/P&gt;&lt;OL&gt;&lt;LI&gt;a GetDateandServer processor to fetch the file names,&lt;/LI&gt;&lt;LI&gt;a step to decompress the files,&lt;/LI&gt;&lt;LI&gt;an ExecuteStreamCommand processor to remove the header line:&lt;/LI&gt;&lt;/OL&gt;&lt;LI-CODE lang="markup"&gt;Command Arguments: 1d
Command Path: sed
Ignore STDIN: false&lt;/LI-CODE&gt;&lt;UL&gt;&lt;LI&gt;&lt;SPAN&gt;&lt;SPAN&gt;a MergeContent processor to merge the files:&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;LI-CODE lang="markup"&gt; Merge strategy: Bin-Packing algorithm
    Merge format: Bin concatenation
    Merge data strategy: Do not merge uncommon metadata
    Min no of entries: 180
    Max no of entries: 1000
    Minimum Group Size: 60GB
    Max Bin age: 5 min
    Max no of bins: 1&lt;/LI-CODE&gt;&lt;UL&gt;&lt;LI&gt;&lt;SPAN&gt;UpdateAttribute to create a file name, then compression of the files&lt;/SPAN&gt;&lt;/LI&gt;&lt;LI&gt;&lt;SPAN&gt;PutHDFS to put the data into HDFS.&lt;/SPAN&gt; &lt;SPAN&gt;My problem is that, after the ExecuteStreamCommand processor is triggered, I see two positions with the same value in the queue: &lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="queue.png" style="width: 999px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/29008iA748303B88B414B1/image-size/large?v=v2&amp;amp;px=999" role="button" title="queue.png" alt="queue.png" /&gt;&lt;/span&gt;&lt;/SPAN&gt;&lt;P&gt;I should get one output file every half hour, but I see two. NiFi is running on 3 nodes, and the queue listing shows FlowFiles on 2 of them. &lt;STRONG&gt;The files on node 1 are merged into one file and the files on node 2 into a second file, so I get 2 files.&lt;/STRONG&gt; Could you please let me know how to get a single file? Thank you in advance for your help.&lt;/P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="after merge.png" style="width: 999px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/29009i394C0C07983AD3BD/image-size/large?v=v2&amp;amp;px=999" role="button" title="after merge.png" alt="after merge.png" /&gt;&lt;/span&gt;Note: I posted this question earlier but could not find it afterwards, so I am posting it again.&lt;/LI&gt;&lt;LI&gt;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/69083"&gt;@Nifi&lt;/a&gt;&amp;nbsp;&lt;/LI&gt;&lt;/UL&gt;</description>
    <pubDate>Mon, 28 Sep 2020 13:43:57 GMT</pubDate>
    <dc:creator>Sru111</dc:creator>
    <dc:date>2020-09-28T13:43:57Z</dc:date>
    <item>
      <title>NiFi MergeContent generating 2 output files instead of 1 file</title>
      <link>https://community.cloudera.com/t5/Support-Questions/NiFi-MergeContent-generating-2-output-files-instead-of-1/m-p/303496#M221577</link>
      <description>&lt;P&gt;I'm using:&lt;/P&gt;&lt;OL&gt;&lt;LI&gt;a GetDateandServer processor to fetch the file names,&lt;/LI&gt;&lt;LI&gt;a step to decompress the files,&lt;/LI&gt;&lt;LI&gt;an ExecuteStreamCommand processor to remove the header line:&lt;/LI&gt;&lt;/OL&gt;&lt;LI-CODE lang="markup"&gt;Command Arguments: 1d
Command Path: sed
Ignore STDIN: false&lt;/LI-CODE&gt;&lt;UL&gt;&lt;LI&gt;&lt;SPAN&gt;&lt;SPAN&gt;a MergeContent processor to merge the files:&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;LI-CODE lang="markup"&gt; Merge strategy: Bin-Packing algorithm
    Merge format: Bin concatenation
    Merge data strategy: Do not merge uncommon metadata
    Min no of entries: 180
    Max no of entries: 1000
    Minimum Group Size: 60GB
    Max Bin age: 5 min
    Max no of bins: 1&lt;/LI-CODE&gt;&lt;UL&gt;&lt;LI&gt;&lt;SPAN&gt;UpdateAttribute to create a file name, then compression of the files&lt;/SPAN&gt;&lt;/LI&gt;&lt;LI&gt;&lt;SPAN&gt;PutHDFS to put the data into HDFS.&lt;/SPAN&gt; &lt;SPAN&gt;My problem is that, after the ExecuteStreamCommand processor is triggered, I see two positions with the same value in the queue: &lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="queue.png" style="width: 999px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/29008iA748303B88B414B1/image-size/large?v=v2&amp;amp;px=999" role="button" title="queue.png" alt="queue.png" /&gt;&lt;/span&gt;&lt;/SPAN&gt;&lt;P&gt;I should get one output file every half hour, but I see two. NiFi is running on 3 nodes, and the queue listing shows FlowFiles on 2 of them. &lt;STRONG&gt;The files on node 1 are merged into one file and the files on node 2 into a second file, so I get 2 files.&lt;/STRONG&gt; Could you please let me know how to get a single file? Thank you in advance for your help.&lt;/P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="after merge.png" style="width: 999px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/29009i394C0C07983AD3BD/image-size/large?v=v2&amp;amp;px=999" role="button" title="after merge.png" alt="after merge.png" /&gt;&lt;/span&gt;Note: I posted this question earlier but could not find it afterwards, so I am posting it again.&lt;/LI&gt;&lt;LI&gt;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/69083"&gt;@Nifi&lt;/a&gt;&amp;nbsp;&lt;/LI&gt;&lt;/UL&gt;</description>
      <pubDate>Mon, 28 Sep 2020 13:43:57 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/NiFi-MergeContent-generating-2-output-files-instead-of-1/m-p/303496#M221577</guid>
      <dc:creator>Sru111</dc:creator>
      <dc:date>2020-09-28T13:43:57Z</dc:date>
    </item>
    <item>
      <title>Re: NiFi MergeContent generating 2 output files instead of 1 file</title>
      <link>https://community.cloudera.com/t5/Support-Questions/NiFi-MergeContent-generating-2-output-files-instead-of-1/m-p/303648#M221641</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/81869"&gt;@Sru111&lt;/a&gt;,&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Screenshot from 2020-09-30 00-29-03.png" style="width: 374px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/29029i0F42FAD967AABBED/image-size/medium?v=v2&amp;amp;px=400" role="button" title="Screenshot from 2020-09-30 00-29-03.png" alt="Screenshot from 2020-09-30 00-29-03.png" /&gt;&lt;/span&gt;&lt;BR /&gt;Treat the MergeContent processor in the picture as your MergeContent processor.&lt;/P&gt;&lt;P&gt;Open the configuration of the queue (here, 'success') that feeds your MergeContent processor.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Screenshot from 2020-09-30 00-32-24.png" style="width: 736px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/29030i4F83DA22F826DA21/image-size/large?v=v2&amp;amp;px=999" role="button" title="Screenshot from 2020-09-30 00-32-24.png" alt="Screenshot from 2020-09-30 00-32-24.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;Set the load balance strategy to Single node. All the files are then routed to only one of the nodes, so MergeContent sees them in a single queue and can merge them into one file.&lt;/P&gt;&lt;P&gt;(Optional)&lt;BR /&gt;You can set the load balance strategy of the queue downstream of the MergeContent processor to Round Robin, so that the files are distributed among all the nodes in the cluster again.&lt;/P&gt;</description>
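      <!--
Editorial sketch for the reply above (kept in an XML comment so the feed stays valid). It is a toy model, not NiFi code: it assumes that each node merges only the FlowFiles sitting in its own queue, and that a "Single node" load balance strategy moves every FlowFile to one node first. The function name merged_outputs, the single_node parameter, and the sample file names are hypothetical.

```python
from collections import defaultdict


def merged_outputs(flowfiles, single_node=None):
    """Toy model: each node merges only the FlowFiles in its own queue.

    flowfiles:   list of (node, filename) pairs.
    single_node: if given, every FlowFile is first moved to that node,
                 mimicking a "Single node" load balance strategy.
    """
    bins = defaultdict(list)
    for node, filename in flowfiles:
        target = single_node if single_node is not None else node
        bins[target].append(filename)
    # One merged file per node that received at least one FlowFile.
    return [sorted(names) for _, names in sorted(bins.items())]


# Hypothetical queue contents spread over two nodes.
queue = [("node1", "a.csv"), ("node2", "b.csv"), ("node1", "c.csv")]
print(len(merged_outputs(queue)))                       # 2
print(len(merged_outputs(queue, single_node="node1")))  # 1
```

With the FlowFiles left on two nodes the model yields two merged files; routing them to one node first yields one, which is the effect the Single node setting is meant to have.
      -->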
      <pubDate>Tue, 29 Sep 2020 19:06:45 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/NiFi-MergeContent-generating-2-output-files-instead-of-1/m-p/303648#M221641</guid>
      <dc:creator>PVVK</dc:creator>
      <dc:date>2020-09-29T19:06:45Z</dc:date>
    </item>
    <item>
      <title>Re: NiFi MergeContent generating 2 output files instead of 1 file</title>
      <link>https://community.cloudera.com/t5/Support-Questions/NiFi-MergeContent-generating-2-output-files-instead-of-1/m-p/303696#M221660</link>
      <description>&lt;P&gt;Thank you&amp;nbsp;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/78607"&gt;@PVVK&lt;/a&gt;&amp;nbsp;for your solution. I cannot see the load balance strategy option on the queue before the MergeContent processor, so I did the following instead. Previously FetchSFTP was set to execute on all nodes; I changed it to execute on the primary node. As a result I now get a single file. Please correct me if I am wrong.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Sru111_0-1601466876983.png" style="width: 400px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/29050i9112E9BF3A6E665D/image-size/medium?v=v2&amp;amp;px=400" role="button" title="Sru111_0-1601466876983.png" alt="Sru111_0-1601466876983.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;There is also a delay when the data is loaded into HDFS by the PutHDFS processor. After compression, when the size changes from MB to GB, the file is loaded only after 1 hour.&lt;BR /&gt;Please see the screenshot below for reference.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Sru111_0-1601456997796.png" style="width: 400px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/29049i8AB6F540F84E9116/image-size/medium?v=v2&amp;amp;px=400" role="button" title="Sru111_0-1601456997796.png" alt="Sru111_0-1601456997796.png" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 30 Sep 2020 11:58:45 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/NiFi-MergeContent-generating-2-output-files-instead-of-1/m-p/303696#M221660</guid>
      <dc:creator>Sru111</dc:creator>
      <dc:date>2020-09-30T11:58:45Z</dc:date>
    </item>
    <item>
      <title>Re: NiFi MergeContent generating 2 output files instead of 1 file</title>
      <link>https://community.cloudera.com/t5/Support-Questions/NiFi-MergeContent-generating-2-output-files-instead-of-1/m-p/303738#M221683</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/81869"&gt;@Sru111&lt;/a&gt;,&lt;/P&gt;&lt;P&gt;Running the FetchSFTP processor on the primary node only is fine.&lt;BR /&gt;However, if that processor has to fetch multiple files from SFTP, the second file is fetched only after the first one has finished (and likewise for the rest), whereas with all 3 nodes the files can be fetched in parallel. So running FetchSFTP on all 3 nodes is preferable.&lt;/P&gt;&lt;P&gt;May I know which version of NiFi you are using? I believe the load balance strategy was introduced in 1.8.0 (not sure), but it started working correctly in version 1.11.4.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Regarding PutHDFS, I don't have a clue about it! Sorry!&lt;/P&gt;</description>
      <pubDate>Wed, 30 Sep 2020 15:08:52 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/NiFi-MergeContent-generating-2-output-files-instead-of-1/m-p/303738#M221683</guid>
      <dc:creator>PVVK</dc:creator>
      <dc:date>2020-09-30T15:08:52Z</dc:date>
    </item>
    <item>
      <title>Re: NiFi MergeContent generating 2 output files instead of 1 file</title>
      <link>https://community.cloudera.com/t5/Support-Questions/NiFi-MergeContent-generating-2-output-files-instead-of-1/m-p/303744#M221688</link>
      <description>&lt;P&gt;Thank you &lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/78607"&gt;@PVVK&lt;/a&gt;.&lt;/P&gt;&lt;P&gt;My NiFi version is&amp;nbsp;&lt;SPAN&gt;1.5.0.&lt;BR /&gt;If I set FetchSFTP to execute on all nodes, I sometimes get duplicate files fetched by different nodes.&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 30 Sep 2020 15:59:57 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/NiFi-MergeContent-generating-2-output-files-instead-of-1/m-p/303744#M221688</guid>
      <dc:creator>Sru111</dc:creator>
      <dc:date>2020-09-30T15:59:57Z</dc:date>
    </item>
    <item>
      <title>Re: NiFi MergeContent generating 2 output files instead of 1 file</title>
      <link>https://community.cloudera.com/t5/Support-Questions/NiFi-MergeContent-generating-2-output-files-instead-of-1/m-p/303746#M221690</link>
      <description>&lt;P&gt;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/81869"&gt;@Sru111&lt;/a&gt;,&lt;/P&gt;&lt;P&gt;If possible, please upgrade NiFi to 1.11.4 or later; the load balancing option is available there. Otherwise, stick to your plan of running the FetchSFTP processor on the primary node only. You can still achieve load balancing in your current NiFi version using Remote Process Groups, but the flow becomes considerably more complex that way.&lt;/P&gt;</description>
      <pubDate>Wed, 30 Sep 2020 16:27:46 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/NiFi-MergeContent-generating-2-output-files-instead-of-1/m-p/303746#M221690</guid>
      <dc:creator>PVVK</dc:creator>
      <dc:date>2020-09-30T16:27:46Z</dc:date>
    </item>
  </channel>
</rss>

