Nifi QueryDatabaseTable duplicate records

New Contributor

I'm using QueryDatabaseTable on a 3-node HDF/NiFi cluster. Once the processor starts, it simultaneously produces three flowfiles containing identical copies of the records, so every record is fetched three times. I suspect that each node fetches its own copy of the records at the same time, without any coordination between the nodes.

To test whether this is the case, I opened the processor's SCHEDULING tab and changed the Execution value from "All nodes" to "Primary node". After applying this change the issue was resolved, and only one copy of each record is fetched.

Is this a bug in NiFi, or is this normal behaviour? What if I need all nodes to participate in fetching records from the database, so that the primary node is not overloaded?

Thanks

4 REPLIES

Re: Nifi QueryDatabaseTable duplicate records

Mentor

That's by design. You want to run this processor on the primary node only and distribute the load further down the line. Here's an article describing a similar approach: https://community.hortonworks.com/articles/16120/how-do-i-distribute-data-across-a-nifi-cluster.html
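To illustrate why "All nodes" produces triplicates, here is a rough toy model in Python. It is not NiFi code: the table, the `id` column used as the incremental marker, and the function names are all made up for the sketch. It also assumes that every node reads the same last-seen value before any node has stored a new one, which matches the behaviour described in the question but simplifies how NiFi actually handles state.

```python
# Toy model only, not NiFi code. All names here are hypothetical.
# Each "node" runs the same incremental query against the same table.

TABLE = [{"id": i, "value": f"row-{i}"} for i in range(1, 6)]


def fetch_new_rows(table, last_max_id):
    """Return rows whose id is greater than the last-seen maximum."""
    return [row for row in table if row["id"] > last_max_id]


def run_on_all_nodes(table, node_count):
    """Simulate Execution = "All nodes": every node is triggered at once."""
    shared_state = {"max_id": 0}
    # Assumption: all nodes read the state before any node writes an update.
    snapshots = [shared_state["max_id"] for _ in range(node_count)]
    flowfiles = [fetch_new_rows(table, snap) for snap in snapshots]
    shared_state["max_id"] = max(row["id"] for row in table)
    return flowfiles


def run_on_primary_only(table):
    """Simulate Execution = "Primary node": a single node runs the query."""
    shared_state = {"max_id": 0}
    rows = fetch_new_rows(table, shared_state["max_id"])
    shared_state["max_id"] = max(row["id"] for row in rows)
    return [rows]


all_nodes = run_on_all_nodes(TABLE, node_count=3)
primary = run_on_primary_only(TABLE)

total_all = sum(len(flowfile) for flowfile in all_nodes)
total_primary = sum(len(flowfile) for flowfile in primary)
print(total_all, total_primary)  # prints "15 5"
```

With three nodes the five rows come out as fifteen records in three identical flowfiles, while the single-node run yields each row once. Once the rows have been fetched on one node, the resulting flowfiles can be spread across the cluster downstream, as the linked article describes.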

Re: Nifi QueryDatabaseTable duplicate records

Contributor

That does not make sense to me. QueryDatabaseTable keeps its state at cluster scope, which I would expect to make it possible to fetch data in parallel.

Or does the state exist for some other purpose?

Re: Nifi QueryDatabaseTable duplicate records

Contributor

Did you find a solution for this?

Re: Nifi QueryDatabaseTable duplicate records

New Contributor

No, I'm only using "Primary node" execution for that type of processor.
