Why does MergeContent have a "duplicate entry" error?

I am almost done creating a DataFlow that ingests vehicle location data from a traffic stream simulator, filters that data, and lands it in a new JSON file. The problem occurs in the MergeContent processor, which shows a "duplicate entry" error. I set the appropriate property of both UnpackContent and MergeContent to zip. Why do I receive this error?

Here is an image of the NiFi DataFlow with the error message:

4292-error-2nd-mergecontent.png

GetFile -> UnpackContent -> ControlRate -> EvaluateXPath -> SplitXml -> EvaluateXPath -> RouteOnAttribute -> AttributesToJSON -> MergeContent -> PutFile

1 ACCEPTED SOLUTION

I figured out the solution to the issue. SplitXml splits the children of the parent element in the XML file into separate flowfiles. When that happens, the flowfiles end up with duplicate names because each child inherits the filename of the original XML file. So I used an UpdateAttribute processor to give each flowfile a distinct name. Then, when the flowfiles were routed to the MergeContent processor, I used List Queue to check that the flowfiles all had distinct names, and they did. I checked the directory where PutFile stored the flowfiles, and they were present as expected. Thank you for the quick response!

Here is an image of my updated dataflow:

4293-errorsolvedmergecontent.png
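
For anyone wondering where the "duplicate entry" text comes from, here is a minimal standalone sketch, not NiFi code. It assumes the zip merge ends up writing one entry per flowfile through `java.util.zip.ZipOutputStream`, using the `filename` attribute as the entry name. The class name `DuplicateEntryDemo`, the file name `vehicles.xml`, and the index-prefix naming scheme are all made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipException;
import java.util.zip.ZipOutputStream;

final class DuplicateEntryDemo {

    // Writes one zip entry per name, the way a zip merge would write one
    // entry per flowfile. ZipOutputStream rejects a name it has already seen.
    static byte[] zip(List<String> names) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream out = new ZipOutputStream(bytes)) {
            for (String name : names) {
                out.putNextEntry(new ZipEntry(name)); // ZipException on a repeated name
                out.write("{}".getBytes(StandardCharsets.UTF_8));
                out.closeEntry();
            }
        }
        return bytes.toByteArray();
    }

    // Hypothetical renaming step standing in for UpdateAttribute:
    // prefix each name with its position so every name is distinct.
    static List<String> withIndex(List<String> names) {
        List<String> unique = new ArrayList<>();
        for (int i = 0; i < names.size(); i++) {
            unique.add(i + "-" + names.get(i));
        }
        return unique;
    }

    public static void main(String[] args) throws IOException {
        // Three splits that all inherited the parent's filename.
        List<String> splits = Arrays.asList("vehicles.xml", "vehicles.xml", "vehicles.xml");

        try {
            zip(splits);
        } catch (ZipException e) {
            System.out.println("Merge failed: " + e.getMessage());
        }

        // After renaming, the same content zips without complaint.
        byte[] merged = zip(withIndex(splits));
        System.out.println("Merged " + merged.length + " bytes");
    }
}
```

In NiFi itself the renaming would be an UpdateAttribute property that sets `filename` to something unique per flowfile, for example an expression like `${filename}-${uuid}`. That expression is only a guess at one workable option, not necessarily what was used in the flow above.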


4 REPLIES

Hi @jmedel, could you share the configuration of your MergeContent processor?

Also, do you have files with the same name in multiple zip files?

@jmedel I think you should answer your question with your answer and accept your own answer. It will mark the thread as answered and will help people to find relevant information when dealing with the same issue 😉

Thank you Pierre!