Getting duplicate flow files after CalculateRecordStats

Explorer

Hello,

 

As shown in the image:

 

UpdateAttribute sends only 4 FlowFiles, but after they pass through the CalculateRecordStats processor, 8 FlowFiles come out. A duplicate is being created for each FlowFile.

 

Can you please advise?

 

Thank you.

f3b53673-630b-4473-b095-63184c64971a.jpg

Elsaa_0-1654642834196.png

 


Super Guru

Hi,

 

Which version of NiFi are you using? Can you share a sample of the CSV content and what you do in UpdateAttribute, so I can try to replicate it? I'm using NiFi 1.16, and when I try it on the sample provided in the CalculateRecordStats help (under additional information), I don't get any duplicates.

Explorer

The NiFi version is 1.11.4.

I am using simple CSV, TXT, and XML files.

Sometimes a file has data rows, and sometimes it has only headers.

Super Guru

Can you try the following JSON and see if you get any duplicates?

[
  {
    "sport": "Soccer",
    "name": "John Smith"
  },
  {
    "sport": "Soccer",
    "name": "John Smith"
  },
  {
    "sport": "tennis",
    "name": "John Smith"
  },
  {
    "sport": "tennis",
    "name": "John Smith"
  },
  {
    "sport": "tennis",
    "name": "John Smith"
  }
]

Basically, create a GenerateFlowFile processor and add the JSON above as the Custom Text. Then add CalculateRecordStats and set a JsonTreeReader as its Record Reader. Run the flow and see if you get any duplicates. If you do, I would recommend upgrading to a later version.
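To know what to expect from that test, here is a rough Python sketch (not NiFi code) that tallies the sample by hand. It assumes you count values of the "sport" field; `expected_stats` is a hypothetical helper name, and the sketch does not try to reproduce the attribute names the processor writes. A correct run should give one outgoing FlowFile whose counts match these numbers.

```python
import json
from collections import Counter

# The same five records as the sample JSON above.
SAMPLE = """
[
  {"sport": "Soccer", "name": "John Smith"},
  {"sport": "Soccer", "name": "John Smith"},
  {"sport": "tennis", "name": "John Smith"},
  {"sport": "tennis", "name": "John Smith"},
  {"sport": "tennis", "name": "John Smith"}
]
"""

def expected_stats(json_text, field):
    """Return (total record count, per-value counts for `field`)."""
    records = json.loads(json_text)
    counts = Counter(r[field] for r in records if field in r)
    return len(records), dict(counts)

total, per_value = expected_stats(SAMPLE, "sport")
print(total, per_value)  # 5 {'Soccer': 2, 'tennis': 3}
```

If the flow emits more than one FlowFile for this single input, the extra copies are coming from the flow layout or the NiFi version rather than from the data.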

Master Mentor

@Elsaa 

A couple of things I would check first:
1. Make sure you do not have two Success relationship connections stacked on top of each other between the "UpdateAttribute" processor and the "CalculateRecordStats" processor. The processor shows 4 in and 12 out, which makes me think the 4 FlowFiles went to three different success connections. You can double-click on a connection line to add a bend point, then click and drag that bend point to see if there is another connection under it.

2. If the above is not the issue, take a look at the provenance data for your 8 generated FlowFiles to see at what point in your dataflow the clones happened.
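If you copy the provenance events out of the UI into a list, a small script can point at the component where the clones appear. This is only a sketch: the dictionary keys `eventType` and `componentName` are assumed names for illustration, so adjust them to whatever your export actually contains, and the events below are made-up placeholders, not data from this flow.

```python
from collections import Counter

def clone_counts(events):
    """Count CLONE events per component name."""
    return Counter(
        e["componentName"] for e in events if e["eventType"] == "CLONE"
    )

# Made-up placeholder events, only to show the shape of the input.
events = [
    {"eventType": "ATTRIBUTES_MODIFIED", "componentName": "UpdateAttribute"},
    {"eventType": "CLONE", "componentName": "UpdateAttribute"},
    {"eventType": "CLONE", "componentName": "UpdateAttribute"},
    {"eventType": "ATTRIBUTES_MODIFIED", "componentName": "CalculateRecordStats"},
]

for component, count in clone_counts(events).items():
    print(component, count)
```

Clones reported on the upstream processor would be consistent with check 1 (stacked connections), while clones reported on CalculateRecordStats itself would suggest looking at that processor's outgoing relationships.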

Thank you,

Matt