How can I use the ListS3 processor to list only recent files?

How can I use the ListS3 processor in NiFi to list only the recent files that are being ingested into my S3 bucket?

Thank you

1 REPLY

Master Guru
@Suhas Reddy

All List processors in NiFi are stateful: each one stores state from its last execution, and on the next run it pulls only the delta (files added after the stored state) from the S3 bucket or directory.

To check the state, right-click the processor and choose View state; this shows the state the processor has stored. To reset it, click Clear state in that dialog (the processor must be stopped first). After that, the processor starts from the beginning and lists all the files in the bucket on its first run.

Each subsequent run then pulls only the files newly added to the bucket.

Configure the ListS3 processor with all the mandatory properties and schedule it to run; the processor will then list the files incrementally.
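To make the "stored state, then delta" behaviour concrete, here is a minimal Python sketch of the idea. It is not NiFi's actual implementation: `ListingState` and `list_new_objects` are hypothetical names, and the bucket is simulated as a plain dictionary of object key to last-modified time. It assumes the state tracks the newest timestamp seen plus the keys already listed at that timestamp.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class ListingState:
    """Hypothetical stand-in for the state a List processor keeps between runs."""
    latest_timestamp: int = -1  # newest last-modified time seen so far
    keys_at_latest: Set[str] = field(default_factory=set)  # keys already listed at that time


def list_new_objects(bucket: Dict[str, int], state: ListingState) -> List[str]:
    """Return the keys that were not listed before and advance the state.

    `bucket` maps object key -> last-modified time (assumed epoch millis).
    """
    new_keys = sorted(
        key for key, modified in bucket.items()
        if modified > state.latest_timestamp
        or (modified == state.latest_timestamp and key not in state.keys_at_latest)
    )
    if new_keys:
        newest = max(bucket[key] for key in new_keys)
        if newest > state.latest_timestamp:
            # A newer timestamp replaces the old one, so the remembered keys reset.
            state.latest_timestamp = newest
            state.keys_at_latest = set()
        state.keys_at_latest |= {key for key in new_keys if bucket[key] == newest}
    return new_keys


# Simulated bucket: key -> last-modified time.
bucket = {"a.csv": 100, "b.csv": 200}
state = ListingState()

first_run = list_new_objects(bucket, state)   # first run lists everything

bucket["c.csv"] = 300                         # a new file is ingested
second_run = list_new_objects(bucket, state)  # only the delta is listed

state = ListingState()                        # equivalent of "Clear state"
after_clear = list_new_objects(bucket, state) # lists everything again
```

In this sketch, replacing `state` with a fresh `ListingState()` plays the role of the Clear state button: the next call behaves like a first run.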

ListS3 Description:

1. Retrieves a listing of objects from an S3 bucket. For each object that is listed, creates a FlowFile that represents the object so that it can be fetched in conjunction with FetchS3Object.
2. This Processor is designed to run on Primary Node only in a cluster. If the primary node changes, the new Primary Node will pick up where the previous node left off without duplicating all of the data.

If this answer helped resolve your issue, please click the Accept button below. That helps other community users find solutions to this kind of issue quickly.