<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: NiFi equivalent to flume spooling directory source in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFi-equivalent-to-flume-spooling-directory-source/m-p/97806#M11297</link>
    <description>&lt;P&gt;Perfect! Thank you.&lt;/P&gt;</description>
    <pubDate>Fri, 04 Dec 2015 02:23:58 GMT</pubDate>
    <dc:creator>gbraccialli3</dc:creator>
    <dc:date>2015-12-04T02:23:58Z</dc:date>
    <item>
      <title>NiFi equivalent to flume spooling directory source</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFi-equivalent-to-flume-spooling-directory-source/m-p/97801#M11292</link>
      <description>&lt;P&gt;I'm trying to use NiFi to replace a basic Flume agent that uses the Spooling Directory Source and sends content to Kafka.&lt;/P&gt;&lt;P&gt;&lt;A target="_blank" href="http://flume.apache.org/FlumeUserGuide.html#spooling-directory-source"&gt;http://flume.apache.org/FlumeUserGuide.html#spooling-directory-source&lt;/A&gt;&lt;/P&gt;&lt;P&gt;The functionality of the Flume Spooling Directory source is described in the Flume documentation as:&lt;/P&gt;&lt;P&gt;&lt;EM&gt;"This source lets you ingest data by placing files to be ingested into a “spooling” directory on disk. This source will watch the specified directory for new files, and will parse events out of new files as they appear. The event parsing logic is pluggable. After a given file has been fully read into the channel, it is renamed to indicate completion (or optionally deleted)."&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;Looking for hints on how to do this with NiFi.&lt;/P&gt;</description>
      <pubDate>Thu, 03 Dec 2015 13:35:26 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFi-equivalent-to-flume-spooling-directory-source/m-p/97801#M11292</guid>
      <dc:creator>gbraccialli3</dc:creator>
      <dc:date>2015-12-03T13:35:26Z</dc:date>
    </item>
    <item>
      <title>Re: NiFi equivalent to flume spooling directory source</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFi-equivalent-to-flume-spooling-directory-source/m-p/97802#M11293</link>
      <description>&lt;P&gt;To do this, build a pipeline starting with the GetFile processor, which can pick up files and delete or move them afterwards (just as the spooldir source does). For the batching functionality you can use MergeContent, or the batching mechanisms on downstream Put processors.&lt;/P&gt;</description>
      <pubDate>Thu, 03 Dec 2015 21:42:15 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFi-equivalent-to-flume-spooling-directory-source/m-p/97802#M11293</guid>
      <dc:creator>sball</dc:creator>
      <dc:date>2015-12-03T21:42:15Z</dc:date>
    </item>
    <item>
      <title>Re: NiFi equivalent to flume spooling directory source</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFi-equivalent-to-flume-spooling-directory-source/m-p/97803#M11294</link>
      <description>&lt;P&gt;Quite often people are actually looking to tail a file line by line. For those cases, the new TailFile processor in NiFi 0.4.0 works better. It also includes advanced features such as detecting (and avoiding) duplicate entries on restart and understanding log file rolling patterns (so it's not limited to the active log file, but can also start from the beginning of the rolled history), etc.&lt;/P&gt;</description>
      <pubDate>Thu, 03 Dec 2015 22:50:12 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFi-equivalent-to-flume-spooling-directory-source/m-p/97803#M11294</guid>
      <dc:creator>andrewg</dc:creator>
      <dc:date>2015-12-03T22:50:12Z</dc:date>
    </item>
    <item>
      <title>Re: NiFi equivalent to flume spooling directory source</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFi-equivalent-to-flume-spooling-directory-source/m-p/97804#M11295</link>
      <description>&lt;P&gt;&lt;A href="https://community.hortonworks.com/questions/5099/nifi-equivalent-to-flume-spolling-directory-source.html#"&gt;@Simon Elliston Ball&lt;/A&gt; Thank you. I'm just starting with NiFi. Would you have a sample template that does it?&lt;/P&gt;</description>
      <pubDate>Fri, 04 Dec 2015 00:50:27 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFi-equivalent-to-flume-spooling-directory-source/m-p/97804#M11295</guid>
      <dc:creator>gbraccialli3</dc:creator>
      <dc:date>2015-12-04T00:50:27Z</dc:date>
    </item>
    <item>
      <title>Re: NiFi equivalent to flume spooling directory source</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFi-equivalent-to-flume-spooling-directory-source/m-p/97805#M11296</link>
      <description>&lt;P&gt;I've attached a very rough template of an SFTP pipeline that does what you are looking for &lt;A rel="user" href="https://community.cloudera.com/users/238/gbraccialli.html" nodeid="238"&gt;@Guilherme Braccialli&lt;/A&gt;. You could replace the initial GetSFTP processor with a GetFile processor and have pretty much the same functionality that you are looking for. &lt;/P&gt;&lt;P&gt;It polls a directory for *.DONE files every 5 seconds. When it finds them, it pushes them through the pipeline, encrypting/compressing them and dropping them off in HDFS and another SFTP directory. Setting the "Keep Source File" property to false in the GetFile and GetSFTP processors deletes the file after it is picked up so it isn't captured multiple times.&lt;/P&gt;</description>
      <pubDate>Fri, 04 Dec 2015 02:21:47 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFi-equivalent-to-flume-spooling-directory-source/m-p/97805#M11296</guid>
      <dc:creator>bwilson</dc:creator>
      <dc:date>2015-12-04T02:21:47Z</dc:date>
    </item>
    <item>
      <title>Re: NiFi equivalent to flume spooling directory source</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFi-equivalent-to-flume-spooling-directory-source/m-p/97806#M11297</link>
      <description>&lt;P&gt;Perfect! Thank you.&lt;/P&gt;</description>
      <pubDate>Fri, 04 Dec 2015 02:23:58 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFi-equivalent-to-flume-spooling-directory-source/m-p/97806#M11297</guid>
      <dc:creator>gbraccialli3</dc:creator>
      <dc:date>2015-12-04T02:23:58Z</dc:date>
    </item>
    <item>
      <title>Re: NiFi equivalent to flume spooling directory source</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFi-equivalent-to-flume-spooling-directory-source/m-p/97807#M11298</link>
      <description>&lt;P&gt;Thanks for this valuable post &lt;A rel="user" href="https://community.cloudera.com/users/99/bwilson.html" nodeid="99"&gt;@Brandon Wilson&lt;/A&gt;. I wonder whether I can recreate my source folder structure in HDFS.&lt;/P&gt;&lt;P&gt;For example, my mounted directory contains store folders, and each store folder contains day folders: Store1 -&amp;gt; Day1, Store1 -&amp;gt; Day2, Store2 -&amp;gt; Day1, etc. I set Remote Path to my mounted directory. NiFi can read this folder and its subfolders to get the XMLs and can write them to HDFS. Can I write these XMLs to the same directory structure as in the file system?&lt;/P&gt;&lt;P&gt;Is there a parametric property value for this purpose? Or do I have to use another processor in NiFi?&lt;/P&gt;&lt;P&gt;Regards&lt;/P&gt;</description>
      <pubDate>Mon, 08 Aug 2016 18:19:44 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFi-equivalent-to-flume-spooling-directory-source/m-p/97807#M11298</guid>
      <dc:creator>ardicberk</dc:creator>
      <dc:date>2016-08-08T18:19:44Z</dc:date>
    </item>
    <item>
      <title>Re: NiFi equivalent to flume spooling directory source</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFi-equivalent-to-flume-spooling-directory-source/m-p/97808#M11299</link>
      <description>&lt;P&gt;Hi &lt;A rel="user" href="https://community.cloudera.com/users/12357/ardicberk.html" nodeid="12357"&gt;@Berk Ardıç&lt;/A&gt;,&lt;/P&gt;&lt;P&gt;You can achieve this type of functionality by modifying a couple of additional pieces of the flow. First, set GetSFTP to search recursively from your mounted directory. This traverses the entire path rooted at your target location, so it will pick up files from the Store1 and Store2 directories. You can then limit what is picked up with the regex filter properties for the path and the file. This handles the pickup side of the flow.&lt;/P&gt;&lt;P&gt;Then, on the delivery side, you can leverage the path attribute of the flowfile to construct a destination in HDFS that mirrors the structure of the pickup directory. Use NiFi Expression Language in the Directory property of PutHDFS to construct the appropriate path, e.g. a value like &lt;CODE&gt;/data/${path}&lt;/CODE&gt;. &lt;/P&gt;&lt;P&gt;Hope this helps.&lt;/P&gt;</description>
      <pubDate>Mon, 08 Aug 2016 22:14:26 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFi-equivalent-to-flume-spooling-directory-source/m-p/97808#M11299</guid>
      <dc:creator>bwilson</dc:creator>
      <dc:date>2016-08-08T22:14:26Z</dc:date>
    </item>
  </channel>
</rss>