
How to prevent duplicates when ingesting into MySQL using NiFi?


I have a dataflow that ingests files from SFTP into MySQL, and I would like to know how to prevent NiFi from inserting an enormous number of duplicates into MySQL. Details are attached below. Thanks.

(1) Data flow

7853-updatednifiwrkfl.jpg

(2) Count after NiFi ingest into MySQL

7854-countfromflowfile.jpg

(3) Original data on SFTP

7855-originaldata.jpg

ACCEPTED SOLUTION

Re: How to prevent duplicates when ingesting into MySQL using NiFi?

In several parts of your flow, multiple relationships are routed to the next processor where you probably want only one. Each extra relationship passes additional flowfiles downstream, which is the likely source of the duplicates. Some examples:

  • Between SplitText and ExtractText you have original and splits connected, but you probably only want splits here.
  • Between ExtractText and ReplaceText you have matched and unmatched, but you probably only want matched.
  • Between ReplaceText and PutSQL you have success and failure, but you probably only want success.
  • On PutSQL you have failure, success, and retry routed back to itself, and you probably only want retry (you definitely don't want success routed back to itself).

You would most likely auto-terminate these other relationships (on the first tab when configuring a processor).
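
To see why the extra relationships matter, here is a rough toy model in Python. It is not NiFi code and does not use any NiFi API; the function names (split_text, rows_reaching_next, put_sql_inserts), the sample data, and the fixed number of passes are all made up for illustration. It only assumes what is described above: SplitText emits the untouched input on original and one flowfile per line on splits, and a relationship looped back to a processor hands the same flowfiles to it again.

```python
def split_text(flowfile):
    """Toy stand-in for SplitText (hypothetical helper, not a NiFi API):
    the untouched input goes to 'original', one flowfile per line to 'splits'."""
    return {"original": [flowfile], "splits": flowfile.splitlines()}


def rows_reaching_next(outputs, routed_relationships):
    """Collect every flowfile handed to the downstream processor,
    given which relationships are connected to it."""
    received = []
    for relationship in routed_relationships:
        received.extend(outputs.get(relationship, []))
    return received


def put_sql_inserts(rows, loop_success, passes=3):
    """Toy stand-in for PutSQL. If 'success' is routed back to the processor,
    the same flowfiles are inserted again on every pass. A real flow would
    keep looping; 'passes' is an assumed cap so the example terminates."""
    table = []
    for _ in range(passes):
        table.extend(rows)
        if not loop_success:
            break
    return table


source_file = "alice,1\nbob,2\ncarol,3"  # made-up sample data
outputs = split_text(source_file)

only_splits = rows_reaching_next(outputs, ["splits"])
original_and_splits = rows_reaching_next(outputs, ["original", "splits"])

clean_table = put_sql_inserts(only_splits, loop_success=False)
looped_table = put_sql_inserts(only_splits, loop_success=True, passes=3)

print(len(only_splits), len(original_and_splits))
print(len(clean_table), len(looped_table))
```

In this sketch, connecting original as well as splits sends the whole file downstream alongside the three individual lines, and looping success back into the insert step multiplies the rows for as long as the loop runs. Auto-terminating the unwanted relationships removes both effects.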


Re: How to prevent duplicates when ingesting into MySQL using NiFi?

@Bryan Bende Thanks a lot for the help; it works perfectly. I attached a snippet of the updated workflow in case anyone runs into this issue in the future. Thanks again.

7856-workingnifiwrkflow.jpg
