How to prevent duplicates when ingesting into MySQL using NiFi?

Rising Star

I have a dataflow that ingests a file from SFTP into MySQL, and I would like to know how to prevent NiFi from inserting an enormous number of duplicates into MySQL. Details are attached below. Thanks.

(1)Data flow

7853-updatednifiwrkfl.jpg

(2) count after nifi ingest into mysql

7854-countfromflowfile.jpg

(3)Original Data on SFTP

7855-originaldata.jpg

1 ACCEPTED SOLUTION

Master Guru

In many parts of your flow you have multiple relationships routed to the next processor when you probably want only one. Some examples:

  • Between SplitText and ExtractText you have both original and splits connected, but you probably only want splits. Otherwise the whole file is passed downstream along with each of its lines.
  • Between ExtractText and ReplaceText you have both matched and unmatched connected, but you probably only want matched.
  • Between ReplaceText and PutSQL you have both success and failure connected, but you probably only want success.
  • On PutSQL you have failure, success, and retry all routed back to itself, and you probably only want retry. You definitely don't want success routed back to itself, because every statement that has already been executed would then be queued and executed again.

You would most likely auto-terminate the other relationships (on the first tab when configuring a processor).
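To make the effect of the extra connections concrete, here is a rough Python sketch. It is a toy model only, not NiFi code: the function names, the sample data, and the regular expression are all made up for illustration, and the `passes` cap stands in for a loop that would otherwise keep going. It only counts how many flowfiles would reach PutSQL under each wiring.

```python
import re

def split_text(content):
    # Toy SplitText: 'original' carries the whole file, 'splits' one flowfile per line.
    return {"original": [content], "splits": content.splitlines()}

def extract_text(flowfiles, pattern):
    # Toy ExtractText: route each flowfile to 'matched' or 'unmatched'.
    out = {"matched": [], "unmatched": []}
    for ff in flowfiles:
        rel = "matched" if re.fullmatch(pattern, ff) else "unmatched"
        out[rel].append(ff)
    return out

def forward(outputs, connected):
    # Pass on only the flowfiles from relationships wired to the next processor.
    return [ff for rel in connected for ff in outputs[rel]]

def arriving_at_putsql(content, split_rels, extract_rels):
    after_split = forward(split_text(content), split_rels)
    # Hypothetical pattern for rows shaped like "id,name".
    return forward(extract_text(after_split, r"\d+,\w+"), extract_rels)

def putsql_executions(statements, loop_success, passes=3):
    # With success routed back to PutSQL, every executed statement re-enters
    # the queue; 'passes' caps the loop so the demo terminates.
    return len(statements) * (passes if loop_success else 1)

data = "1,alice\n2,bob\n3,carol"  # made-up sample file
misrouted = arriving_at_putsql(data, ["original", "splits"], ["matched", "unmatched"])
fixed = arriving_at_putsql(data, ["splits"], ["matched"])

print(len(misrouted), len(fixed))        # 4 3
print(putsql_executions(fixed, True))    # 9
print(putsql_executions(fixed, False))   # 3
```

In this model, connecting original and unmatched lets the unsplit file travel downstream as an extra flowfile, and looping success multiplies every insert by the number of passes. With only splits, matched, and success connected forward, and only retry looped back, each source row reaches PutSQL once.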

2 REPLIES

Rising Star

@Bryan Bende Thanks a lot for the help, it works perfectly. I attached a snippet of the updated workflow in case anyone runs into this issue in the future. Thanks again.

7856-workingnifiwrkflow.jpg