Created on 05-17-2016 04:52 PM - edited 08-18-2019 05:39 AM
I am almost done creating a DataFlow that ingests vehicle location data from a traffic stream simulator, filters that data, and lands the result in a new JSON file. The problem occurs at the MergeContent processor, which shows a "duplicate entry" error. I set the relevant format property to zip on both UnpackContent and MergeContent. Why do I receive this error?
Here is an image of the NiFi DataFlow with the error message:
GetFile -> UnpackContent -> ControlRate -> EvaluateXPath -> SplitXml -> EvaluateXPath -> RouteOnAttribute -> AttributesToJSON -> MergeContent -> PutFile
Created 05-17-2016 04:55 PM
Hi @jmedel, could you share the configuration of your MergeContent processor?
Also, do you have files with the same name in multiple zip files?
Created on 05-17-2016 05:45 PM - edited 08-18-2019 05:39 AM
I figured out the solution to the issue. SplitXml splits the children of the parent element in the XML file into separate flowfiles. When that happens, the flowfiles end up with duplicate names, because each child keeps the same filename as the original XML file. MergeContent then tries to add several zip entries with the same name, which produces the "duplicate entry" error. So I used an UpdateAttribute processor to give each flowfile a distinct name. Once the flowfiles were routed to the MergeContent processor, I used List Queue to check that they all had distinct names, and they did. I also checked the directory that PutFile writes to, and the files were present as expected. Thank you for the quick response!
Here is an image of my updated dataflow:
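For anyone who wants to see the underlying behaviour outside NiFi, here is a minimal sketch in plain Java. It is not NiFi's own code; it only shows that `java.util.zip.ZipOutputStream` refuses a second entry with a name it has already seen, which is the same kind of failure described above. The class name, the `zipAll` helper, and the file names are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipException;
import java.util.zip.ZipOutputStream;

public class DuplicateEntryDemo {

    // Hypothetical helper: writes one zip entry per name into an in-memory zip.
    // Returns null if every entry was accepted, or the ZipException message otherwise.
    public static String zipAll(List<String> names) throws IOException {
        try (ZipOutputStream zip = new ZipOutputStream(new ByteArrayOutputStream())) {
            for (String name : names) {
                zip.putNextEntry(new ZipEntry(name)); // throws ZipException on a repeated name
                zip.write("{}".getBytes(StandardCharsets.UTF_8));
                zip.closeEntry();
            }
            return null;
        } catch (ZipException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) throws IOException {
        // Every split keeps the parent's filename, as happens after SplitXml.
        System.out.println(zipAll(Arrays.asList("vehicles.xml", "vehicles.xml")));

        // Distinct names, as after renaming each flowfile with UpdateAttribute.
        System.out.println(zipAll(Arrays.asList("vehicles-1.xml", "vehicles-2.xml")));
    }
}
```

The first call reports a message starting with "duplicate entry", and the second returns null because every name is unique. In the flow itself, one possible way to get unique names (an assumption, since the exact expression is not given above) is an UpdateAttribute property named `filename` with a value such as `${filename}-${uuid}`.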
Created 05-17-2016 05:54 PM
@jmedel I think you should post your solution as an answer to your own question and accept it. That will mark the thread as answered and help people find the relevant information when they run into the same issue 😉
Created 05-17-2016 05:57 PM
Thank you Pierre!