
NiFi - Ingest files from local system using ListFile, irrespective of timestamp


We are trying to ingest files from a local file system into HDFS. The NiFi flow is:


ListFile => FetchFile => UpdateAttribute => PutHDFS



The ListFile Listing Strategy is set to Tracking Entities, and Execution is set to Primary node only.

Scenario -

On our local system, we have a directory for each month's data (for example 201903, 201911):




Each file's timestamp corresponds to its month. All of these monthly folders continuously receive data.

When the ListFile processor has ingested files from the 201911 directory and new files are then added to other folders (folders older than 201911, say 201903), those files are not picked up by ListFile. I tried different values for the Entity Tracking Time Window property, but no luck. Apparently the Tracking Entities listing strategy is behaving like Tracking Timestamps (caching the latest timestamp from ingested files and not ingesting any files with an older timestamp).

As far as I understand, when ListFile uses Tracking Entities as its Listing Strategy, it caches the name, size, and last-modified timestamp of each listed file, and then keeps listing only those files that are not already in the cache, based on these attributes.
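To make the behaviour I expect concrete, here is a minimal Python sketch of the two strategies as I understand them. It is only an illustration, not NiFi's actual implementation: the class names, the file names under 201911 and 201903, and the sizes and timestamps are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileEntry:
    """Hypothetical stand-in for one listed file."""
    name: str
    size: int
    mtime: int  # last-modified time, epoch seconds


class TimestampTracker:
    """Assumed model of timestamp tracking: list only files newer than the latest timestamp seen."""

    def __init__(self):
        self.latest = None

    def list_new(self, files):
        new = [f for f in files if self.latest is None or f.mtime > self.latest]
        if new:
            self.latest = max(f.mtime for f in new)
        return new


class EntityTracker:
    """Assumed model of entity tracking: list any file whose (name, size, mtime) is not cached."""

    def __init__(self):
        self.seen = set()

    def list_new(self, files):
        new = [f for f in files if f not in self.seen]
        self.seen.update(new)
        return new


# A file in 201911 is listed first; later a file with an older timestamp lands in 201903.
first = [FileEntry("201911/a.csv", 10, 1573000000)]
later = first + [FileEntry("201903/b.csv", 20, 1552000000)]

for tracker in (TimestampTracker(), EntityTracker()):
    tracker.list_new(first)
    picked = [f.name for f in tracker.list_new(later)]
    print(type(tracker).__name__, picked)
```

In this sketch the timestamp-based tracker skips the late 201903 file, while the entity-based tracker picks it up. The second behaviour is what I expected from ListFile, but what I observe looks like the first.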

  1. Why is the ListFile processor not picking up these files with my current configuration?
  2. I want to continuously ingest new files based on filename (irrespective of timestamp) and skip files that have already been ingested. Is there a workaround to achieve this?







P.S. - I've asked the same question on StackOverflow,