<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Can I get the files in the middle of the data flow? in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/Can-I-get-the-files-in-the-middle-of-the-data-flow/m-p/186490#M148592</link>
    <pubDate>Sun, 18 Aug 2019 08:11:18 GMT</pubDate>
    <dc:creator>Shu_ashu</dc:creator>
    <dc:date>2019-08-18T08:11:18Z</dc:date>
    <item>
      <title>Can I get the files in the middle of the data flow?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Can-I-get-the-files-in-the-middle-of-the-data-flow/m-p/186487#M148589</link>
      <description>&lt;P&gt;Can I get the files in the middle of the data flow?&lt;/P&gt;&lt;P&gt;I know I can get files with the GetFile processor, but it can only be used at the beginning of a data flow. Please advise how I can retrieve files in the middle of the data flow.&lt;/P&gt;&lt;P&gt;The reason is that I would like to pass a dynamically generated directory to GetFile or a similar processor, so the retrieval has to happen in the middle of the flow.&lt;/P&gt;</description>
      <pubDate>Thu, 10 May 2018 15:40:18 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Can-I-get-the-files-in-the-middle-of-the-data-flow/m-p/186487#M148589</guid>
      <dc:creator>adamchui</dc:creator>
      <dc:date>2018-05-10T15:40:18Z</dc:date>
    </item>
    <item>
      <title>Re: Can I get the files in the middle of the data flow?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Can-I-get-the-files-in-the-middle-of-the-data-flow/m-p/186488#M148590</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/17015/bearchui.html" nodeid="17015"&gt;@adam chui&lt;/A&gt;
&lt;/P&gt;&lt;P&gt;If you have the &lt;STRONG&gt;fully qualified filename, including the directory,&lt;/STRONG&gt; in your flow, you can use the &lt;STRONG&gt;FetchFile processor&lt;/STRONG&gt;. It accepts an incoming connection, so you can pass the attributes (directory/filename) in the &lt;STRONG&gt;File to Fetch property&lt;/STRONG&gt; to pull the file into the flow.&lt;/P&gt;&lt;P&gt;If you do not have the fully qualified filename, first list all the files in the directory with the &lt;STRONG&gt;ExecuteStreamCommand&lt;/STRONG&gt; processor, passing the &lt;STRONG&gt;dynamically generated directory name as an argument&lt;/STRONG&gt;. Then use the FetchFile processor to pull the required files into the data flow.&lt;/P&gt;&lt;P&gt;Please refer to &lt;A href="https://community.hortonworks.com/questions/186430/hi-everyone-is-there-any-processor-available-in-ni.html?childToView=186970#answer-186970" target="_blank"&gt;this&lt;/A&gt; link, where I have explained how to use the ExecuteStreamCommand processor to list all the files in a directory. In addition, to keep only the required filenames, you can use a RouteOnAttribute processor before the FetchFile processor.&lt;/P&gt;&lt;P&gt;-&lt;/P&gt;&lt;P&gt;If the answer addressed your question, &lt;STRONG&gt;click the Accept button below to accept it.&lt;/STRONG&gt; That would be a great help to community users looking for a quick solution to this kind of issue.&lt;/P&gt;</description>
      <pubDate>Thu, 10 May 2018 19:21:02 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Can-I-get-the-files-in-the-middle-of-the-data-flow/m-p/186488#M148590</guid>
      <dc:creator>Shu_ashu</dc:creator>
      <dc:date>2018-05-10T19:21:02Z</dc:date>
    </item>
    <item>
      <title>Re: Can I get the files in the middle of the data flow?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Can-I-get-the-files-in-the-middle-of-the-data-flow/m-p/186489#M148591</link>
      <description>&lt;P&gt;Could you give me a concrete example of that?&lt;/P&gt;&lt;P&gt;If you do not have the fully qualified filename, first list all the files in the directory with the &lt;STRONG&gt;ExecuteStreamCommand&lt;/STRONG&gt; processor, passing the &lt;STRONG&gt;dynamically generated directory name as an argument&lt;/STRONG&gt;. Then use the FetchFile processor to pull the required files into the data flow.&lt;/P&gt;</description>
      <pubDate>Fri, 11 May 2018 05:38:38 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Can-I-get-the-files-in-the-middle-of-the-data-flow/m-p/186489#M148591</guid>
      <dc:creator>adamchui</dc:creator>
      <dc:date>2018-05-11T05:38:38Z</dc:date>
    </item>
    <item>
      <title>Re: Can I get the files in the middle of the data flow?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Can-I-get-the-files-in-the-middle-of-the-data-flow/m-p/186490#M148592</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/17015/bearchui.html" nodeid="17015" target="_blank"&gt;@adam chui&lt;/A&gt;
&lt;/P&gt;&lt;P&gt;Sure.&lt;/P&gt;&lt;P&gt;I have created a directory called nifi_test in the /tmp directory.&lt;/P&gt;&lt;PRE&gt;[bash tmp]$ mkdir nifi_test
[bash tmp]$ cd nifi_test/
[bash nifi_test]$ touch test.txt
[bash nifi_test]$ touch test1.txt
[bash nifi_test]$ touch test2.txt
[bash nifi_test]$ ll
total 0
-rw-r--r-- 1 nifi nifi 0 May 10 19:16 test1.txt
-rw-r--r-- 1 nifi nifi 0 May 10 19:16 test2.txt
-rw-r--r-- 1 nifi nifi 0 May 10 19:16 test.txt&lt;/PRE&gt;&lt;P&gt;Make sure NiFi has permission to read the files in the directory.&lt;BR /&gt;Let's assume that, in the middle of the flow, you have a dynamically generated directory attribute with the value &lt;STRONG&gt;/tmp/nifi_test/&lt;/STRONG&gt;.&lt;/P&gt;&lt;P&gt;Now we need to fetch all the files in the &lt;STRONG&gt;/tmp/nifi_test&lt;/STRONG&gt; directory.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Flow:-&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="72737-flow.png" style="width: 2454px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/18795i67B1B5DEBBB3896D/image-size/medium?v=v2&amp;amp;px=400" role="button" title="72737-flow.png" alt="72737-flow.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;GenerateFlowFile configs:-&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;I have added a new property:&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;directory&lt;/STRONG&gt;&lt;/P&gt;&lt;PRE&gt;/tmp/nifi_test&lt;/PRE&gt;&lt;P&gt;Now I have a flowfile with a directory attribute whose value is &lt;STRONG&gt;/tmp/nifi_test&lt;/STRONG&gt;.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;ExecuteStreamCommand configs:-&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="72738-escommand.png" style="width: 1419px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/18796i3B77FD5443945281/image-size/medium?v=v2&amp;amp;px=400" role="button" title="72738-escommand.png" alt="72738-escommand.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;Here I pass the directory attribute as a command argument and list all the files in the directory (/tmp/nifi_test).&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&lt;U&gt;SplitText configs:-&lt;/U&gt;&lt;/STRONG&gt;&lt;BR /&gt;When the directory contains more than one file, use this processor to split the listing into individual flowfiles, one per file.&lt;/P&gt;&lt;P&gt;Change the following property value:&lt;BR /&gt;&lt;STRONG&gt;Line Split Count&lt;/STRONG&gt;&lt;/P&gt;&lt;PRE&gt;1&lt;/PRE&gt;&lt;P&gt;&lt;STRONG&gt;&lt;U&gt;ExtractText configs:-&lt;/U&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;We need to pull all the files from the directory dynamically, so use the ExtractText processor and add a new property:&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;filename&lt;/STRONG&gt;&lt;/P&gt;&lt;PRE&gt;(.*)&lt;/PRE&gt;&lt;P&gt;This processor extracts the flowfile content and stores it in the filename attribute.&lt;/P&gt;&lt;P&gt;We now have both the directory and the name of each file in it as attributes.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&lt;U&gt;FetchFile configs:-&lt;/U&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="72739-fetchfile.png" style="width: 1369px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/18797i2790ED611C5F77AC/image-size/medium?v=v2&amp;amp;px=400" role="button" title="72739-fetchfile.png" alt="72739-fetchfile.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;In the File to Fetch property we use the directory and filename attributes to fetch the file(s) from the directory. At the end of the flow screenshot you can see that 3 files were fetched from the directory.&lt;/P&gt;&lt;P&gt;This way we are able to pull files in the middle of the flow.&lt;/P&gt;&lt;P&gt;I have attached my flow.xml; save it, upload the XML to your NiFi instance, and test it out.&lt;/P&gt;&lt;P&gt;&lt;A href="https://community.cloudera.com/legacyfs/online/attachments/72740-fetch-files-189935.xml" target="_blank"&gt;fetch-files-189935.xml&lt;/A&gt;&lt;/P&gt;</description>
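      <content:encoded><![CDATA[
The steps in this reply (list the directory, split the listing into one line per file, copy each line into a filename attribute, then fetch directory plus filename) can be sketched outside NiFi in plain Python. This is only an illustration of the logic, not NiFi code. The function names are hypothetical stand-ins for the processors, and joining the two attributes as directory/filename is an assumption about the File to Fetch value, which is only visible in the screenshot.

```python
import os
import re
import tempfile


def list_directory(directory):
    """Stand-in for ExecuteStreamCommand: one file name per line."""
    names = sorted(
        name for name in os.listdir(directory)
        if os.path.isfile(os.path.join(directory, name))
    )
    return "\n".join(names)


def split_lines(listing):
    """Stand-in for SplitText with Line Split Count set to 1."""
    return [line for line in listing.splitlines() if line]


def extract_filename(line):
    """Stand-in for ExtractText with a filename property of (.*)."""
    return re.match(r"(.*)", line).group(1)


def fetch_file(directory, filename):
    """Stand-in for FetchFile; assumes the path is directory/filename."""
    with open(os.path.join(directory, filename), "rb") as handle:
        return handle.read()


def fetch_all(directory):
    """Run the four steps in order and return {filename: content}."""
    fetched = {}
    for line in split_lines(list_directory(directory)):
        filename = extract_filename(line)
        fetched[filename] = fetch_file(directory, filename)
    return fetched


if __name__ == "__main__":
    # Recreate the three empty test files in a throwaway directory.
    with tempfile.TemporaryDirectory() as directory:
        for name in ("test.txt", "test1.txt", "test2.txt"):
            with open(os.path.join(directory, name), "w"):
                pass
        print(sorted(fetch_all(directory)))
```

Each listed name becomes its own unit of work before the fetch, which mirrors why SplitText sits between ExecuteStreamCommand and ExtractText in the flow.
]]></content:encoded>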
      <pubDate>Sun, 18 Aug 2019 08:11:18 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Can-I-get-the-files-in-the-middle-of-the-data-flow/m-p/186490#M148592</guid>
      <dc:creator>Shu_ashu</dc:creator>
      <dc:date>2019-08-18T08:11:18Z</dc:date>
    </item>
  </channel>
</rss>

