
Issue NiFi State Management for ListS3 | Triggering flow twice or more for same file

New Contributor

Hi Team, we noticed that the ListS3 processor is processing the same file two or more times, even though the source file has not changed.

According to the "Component State" view, NiFi tracks file state using the latest timestamp and lists a file only when it has been newly added or an existing file has been modified.
In Data Provenance, however, we see the ListS3 processor re-listing the same file with the same s3.lastModified property value. We believe a file should not be re-processed unless it has changed, per the "State Management" docs.
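To make the expected behaviour concrete, here is a minimal Python sketch of timestamp-based listing state. It is not NiFi's actual implementation: the names `ListingState` and `list_new_objects` are hypothetical, and the sketch assumes the state consists of the latest timestamp seen plus the keys listed at that timestamp.

```python
from dataclasses import dataclass, field


@dataclass
class ListingState:
    """Hypothetical stand-in for the processor's stored component state."""
    latest_timestamp: int = 0
    keys_at_latest: set = field(default_factory=set)


def list_new_objects(objects, state):
    """Return the (key, last_modified) pairs not yet listed, then advance state.

    objects: iterable of (key, last_modified_millis) tuples.
    """
    emitted = []
    max_ts = state.latest_timestamp
    keys_at_max = set(state.keys_at_latest)
    for key, last_modified in objects:
        if last_modified < state.latest_timestamp:
            continue  # older than anything already listed
        if last_modified == state.latest_timestamp and key in state.keys_at_latest:
            continue  # already listed at this exact timestamp
        emitted.append((key, last_modified))
        if last_modified > max_ts:
            max_ts = last_modified
            keys_at_max = {key}
        elif last_modified == max_ts:
            keys_at_max.add(key)
    state.latest_timestamp = max_ts
    state.keys_at_latest = keys_at_max
    return emitted
```

Under this model, listing an unchanged bucket a second time with the same state returns nothing, which is the behaviour we expected from ListS3. Duplicates would appear if the stored state were lost or not consulted between runs.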

Kindly help us with this issue. Let us know if any further information is needed.

Note: The ListS3 processor is configured to run on the primary node only.

3 REPLIES

Re: Issue NiFi State Management for ListS3 | Triggering flow twice or more for same file

New Contributor

This is a bug. There is an open defect, which also describes workaround solutions.

https://issues.apache.org/jira/browse/NIFI-4715

Re: Issue NiFi State Management for ListS3 | Triggering flow twice or more for same file

New Contributor

Thank you, @Milan Das. So does the workaround you suggested require changes to the NiFi code?

For now I have added a DetectDuplicate processor.

Re: Issue NiFi State Management for ListS3 | Triggering flow twice or more for same file

New Contributor

Yes, a code fix in NiFi is needed. The workaround is the DetectDuplicate processor.
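For anyone reading later, the idea behind the duplicate-detection workaround can be sketched in a few lines of Python. This only illustrates the concept and is not how the DetectDuplicate processor is implemented: `make_duplicate_filter` is a hypothetical helper, the in-memory set stands in for a cache shared across runs, and keying on the object key plus its last-modified value is an assumption about what identifies "the same file".

```python
def make_duplicate_filter():
    """Build a checker that reports whether an identifier was seen before."""
    seen = set()  # stand-in for a cache that persists across listings

    def is_duplicate(key, last_modified):
        # Assumed identifier: object key plus its last-modified timestamp,
        # so a genuinely modified file is still let through.
        identifier = f"{key}|{last_modified}"
        if identifier in seen:
            return True
        seen.add(identifier)
        return False

    return is_duplicate


# Usage: drop repeated listings of an unchanged file.
is_duplicate = make_duplicate_filter()
listings = [("data.csv", 1000), ("data.csv", 1000), ("data.csv", 2000)]
unique = [item for item in listings if not is_duplicate(*item)]
```

In the flow itself, the equivalent choice would be which attributes make up the cache entry identifier, for example the file name combined with s3.lastModified, so that a repeated listing of an unchanged file is routed as a duplicate while a modified file still passes.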