<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: FetchHDFS Process to fetch Nested data in HDFS in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/Apache-NiFi-HDFS-Ingestion-Using-ListHDFS-for-Recursive/m-p/185139#M254799</link>
    <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/3741/akashsabarad.html" nodeid="3741"&gt;@Akash S&lt;/A&gt;&lt;/P&gt;&lt;P&gt;The ListHDFS processor records state so that only new files are listed. It also has a configuration option for recursing into subdirectories. You could set the directory to just /MajorData/Location/ and let it list all files from the subdirectories. As new subdirectories are created, the files within them will get listed.&lt;/P&gt;&lt;P&gt;If that does not work for you, the NiFi Expression Language (EL) statement you are looking for would look something like this for the directory:&lt;/P&gt;&lt;PRE&gt;/MajorData/Location/${now():format('yyyy/MM/dd')}&lt;/PRE&gt;&lt;P&gt;The above would cause NiFi to look for files only in the current day's directory until the day changes. I am not sure of the rate at which files are written into these target directories, but be mindful that if a file is added between runs of the ListHDFS processor and the day changes between those runs, that file will not get listed using the above EL statement.&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;&lt;P&gt;Matt&lt;/P&gt;</description>
    <pubDate>Thu, 13 Jul 2017 19:42:39 GMT</pubDate>
    <dc:creator>MattWho</dc:creator>
    <dc:date>2017-07-13T19:42:39Z</dc:date>
    <item>
      <title>Apache NiFi HDFS Ingestion: Using ListHDFS for Recursive Traversal of Nested Directories</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Apache-NiFi-HDFS-Ingestion-Using-ListHDFS-for-Recursive/m-p/185138#M254798</link>
      <description>&lt;P&gt;Hi All,&lt;/P&gt;
&lt;P&gt;I want to fetch data that is stored in HDFS using the FetchHDFS processor.&lt;/P&gt;
&lt;P&gt;Our data is stored in a folder structure like /MajorData/Location/Year/Month/Day/file1.txt (for example, /MajorData/Location/2017/01/01/file1.txt). As the day changes, the path changes to /MajorData/Location/2017/01/02/file2.txt.&lt;/P&gt;
&lt;P&gt;How can I write a NiFi expression that will traverse all of these folders and fetch the data into NiFi?&lt;/P&gt;</description>
      <pubDate>Wed, 03 Jun 2026 13:40:12 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Apache-NiFi-HDFS-Ingestion-Using-ListHDFS-for-Recursive/m-p/185138#M254798</guid>
      <dc:creator>akash_sabarad</dc:creator>
      <dc:date>2026-06-03T13:40:12Z</dc:date>
    </item>
    <item>
      <title>Re: FetchHDFS Process to fetch Nested data in HDFS</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Apache-NiFi-HDFS-Ingestion-Using-ListHDFS-for-Recursive/m-p/185139#M254799</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/3741/akashsabarad.html" nodeid="3741"&gt;@Akash S&lt;/A&gt;&lt;/P&gt;&lt;P&gt;The ListHDFS processor records state so that only new files are listed. It also has a configuration option for recursing into subdirectories. You could set the directory to just /MajorData/Location/ and let it list all files from the subdirectories. As new subdirectories are created, the files within them will get listed.&lt;/P&gt;&lt;P&gt;If that does not work for you, the NiFi Expression Language (EL) statement you are looking for would look something like this for the directory:&lt;/P&gt;&lt;PRE&gt;/MajorData/Location/${now():format('yyyy/MM/dd')}&lt;/PRE&gt;&lt;P&gt;The above would cause NiFi to look for files only in the current day's directory until the day changes. I am not sure of the rate at which files are written into these target directories, but be mindful that if a file is added between runs of the ListHDFS processor and the day changes between those runs, that file will not get listed using the above EL statement.&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;&lt;P&gt;Matt&lt;/P&gt;</description>
      <pubDate>Thu, 13 Jul 2017 19:42:39 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Apache-NiFi-HDFS-Ingestion-Using-ListHDFS-for-Recursive/m-p/185139#M254799</guid>
      <dc:creator>MattWho</dc:creator>
      <dc:date>2017-07-13T19:42:39Z</dc:date>
    </item>
    <item>
      <title>Re: FetchHDFS Process to fetch Nested data in HDFS</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Apache-NiFi-HDFS-Ingestion-Using-ListHDFS-for-Recursive/m-p/185140#M254800</link>
      <description>&lt;P&gt;Thank you, Matt. ListHDFS was a good hint. I was able to accomplish my task with your input.&lt;/P&gt;</description>
      <pubDate>Mon, 17 Jul 2017 01:36:05 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Apache-NiFi-HDFS-Ingestion-Using-ListHDFS-for-Recursive/m-p/185140#M254800</guid>
      <dc:creator>akash_sabarad</dc:creator>
      <dc:date>2017-07-17T01:36:05Z</dc:date>
    </item>
  </channel>
</rss>

