@Umakanth 

From your shared log lines we can see two things:

1. "LOG 1" shows "StandardFlowFileRecord[uuid=345d9b6d-e9f7-4dd8-ad9a-a9d66fdfd902" and "LOG 2" shows "Successfully sent [StandardFlowFileRecord[uuid=f74eb941-a233-4f9e-86ff-07723940f012". This tells us that these two "RandomFile1154.txt" entries are two different FlowFiles. So it does not look like the RPG sent the same FlowFile twice, but rather that it sent two FlowFiles, each referencing the same content. I am not sure how you have your LogAttribute processor configured, but you should look for the log output produced for these two uuids to learn more about these two FlowFiles. From your comments, I suspect you will find that only one of them passed through your LogAttribute processor.

2. Both logs also show that these two FlowFiles reference the same resource claim in the content_repository (same id, container, and section, with the same length):
"LOG 1" --> claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1599937413340-1, container=default, section=1], offset=1073154, length=237],offset=0,name=RandomFile1154.txt,size=237]
"LOG 2" --> claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1599937413340-1, container=default, section=1], offset=1109014, length=237],offset=0,name=RandomFile1154.txt,size=237]

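To compare the two FlowFiles field by field rather than by eye, the uuid and claim details can be pulled out of each log line. The following is a minimal sketch, not a NiFi tool: the function names are hypothetical, and the sample lines are assembled from the fragments quoted above, so the exact layout of a full log line is an assumption.

```python
import re

# Patterns for the fields quoted in the log fragments above.
# Assumption: uuid and claim appear in this textual form on every line.
UUID_RE = re.compile(r"uuid=([0-9a-f-]{36})")
CLAIM_RE = re.compile(
    r"StandardResourceClaim\[id=([^,\]]+), container=([^,\]]+), section=([^,\]]+)\], "
    r"offset=(\d+), length=(\d+)\]"
)


def parse_flowfile(line):
    """Extract the uuid and content claim fields from one log line, or None."""
    uuid = UUID_RE.search(line)
    claim = CLAIM_RE.search(line)
    if not uuid or not claim:
        return None
    rid, container, section, offset, length = claim.groups()
    return {
        "uuid": uuid.group(1),
        "resource_claim": (rid, container, section),
        "claim_offset": int(offset),
        "length": int(length),
    }


def compare(a, b):
    """Report which identifying fields two parsed FlowFiles share."""
    return {
        "same_flowfile": a["uuid"] == b["uuid"],
        "same_resource_claim": a["resource_claim"] == b["resource_claim"],
        "same_claim_offset": a["claim_offset"] == b["claim_offset"],
        "same_length": a["length"] == b["length"],
    }


# Sample lines joined from the fragments in this thread (layout assumed).
LOG_1 = (
    "StandardFlowFileRecord[uuid=345d9b6d-e9f7-4dd8-ad9a-a9d66fdfd902,"
    "claim=StandardContentClaim [resourceClaim=StandardResourceClaim"
    "[id=1599937413340-1, container=default, section=1], offset=1073154, "
    "length=237],offset=0,name=RandomFile1154.txt,size=237]"
)
LOG_2 = (
    "Successfully sent [StandardFlowFileRecord[uuid=f74eb941-a233-4f9e-86ff-07723940f012,"
    "claim=StandardContentClaim [resourceClaim=StandardResourceClaim"
    "[id=1599937413340-1, container=default, section=1], offset=1109014, "
    "length=237],offset=0,name=RandomFile1154.txt,size=237]"
)

if __name__ == "__main__":
    print(compare(parse_flowfile(LOG_1), parse_flowfile(LOG_2)))
```

Running the same comparison over every log line that mentions the file name would show which uuids exist for it and which claim fields they share.
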
This typically happens when a FlowFile gets cloned somewhere in your dataflow, for example when the same relationship from a processor is assigned to two connections.

Since you saw that GetFile ingested the file only once, that rules out GetFile as the source of this duplication. Besides, had it been GetFile, you would not have seen matching claim information. LogAttribute has only a single "success" relationship, so if you had drawn two connections that both carry the "success" relationship, you would have seen duplicates of every ingested file. So this seems unlikely as well.

Next you have your PutFile processor, which has both "success" and "failure" relationships. I suspect the "success" relationship is assigned to the connection going to your Remote Process Group, and the "failure" relationship is assigned to a connection that loops back on PutFile itself(?). If you had accidentally drawn the "failure" connection twice (one may be stacked on top of the other), then any time a FlowFile failed in PutFile it would have been routed to one failure connection and cloned to the other. On a later attempt both would then be processed successfully by PutFile, and you would end up with the original and the clone both sent to your RPG.
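
A stacked duplicate connection is hard to spot on the canvas, so it can help to check a list of connections programmatically. The sketch below assumes the connections have already been exported into a simplified list of dicts. The keys `id`, `source_id`, `destination_id`, and `relationships` are hypothetical names chosen for illustration, not NiFi's own schema, and the sample data is made up to mirror the scenario described above.

```python
from collections import defaultdict


def find_duplicate_relationships(connections):
    """Return {(source_id, relationship): [connection ids]} for every
    relationship that is carried by more than one connection from the
    same source. Each such group means FlowFiles routed to that
    relationship are cloned, once per extra connection."""
    seen = defaultdict(list)
    for conn in connections:
        for rel in conn["relationships"]:
            seen[(conn["source_id"], rel)].append(conn["id"])
    return {key: ids for key, ids in seen.items() if len(ids) > 1}


# Hypothetical flow: PutFile sends "success" to the RPG and has two
# "failure" connections looping back on itself.
connections = [
    {"id": "c1", "source_id": "PutFile", "destination_id": "RPG",
     "relationships": ["success"]},
    {"id": "c2", "source_id": "PutFile", "destination_id": "PutFile",
     "relationships": ["failure"]},
    {"id": "c3", "source_id": "PutFile", "destination_id": "PutFile",
     "relationships": ["failure"]},
]

if __name__ == "__main__":
    for (source, rel), ids in find_duplicate_relationships(connections).items():
        print(f"{source} routes '{rel}' to {len(ids)} connections: {ids}")
```

A reported group is not necessarily a mistake, since routing one relationship to two destinations is also how a dataflow clones on purpose. Two entries with the same source, destination, and relationship are the ones worth a closer look.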

Hope this helps,
Matt
