Created 02-27-2017 09:15 AM
I'm using QueryDatabaseTable on a 3-node HDF/NiFi cluster. Once the processor starts, it simultaneously produces three flowfiles containing identical copies of the records, so every record is fetched three times. I suspect that each node fetches its own copy of the records at the same time, without any coordination between the nodes.
To test if this is the case I changed the configuration of the processor on the SCHEDULING tab, by changing the Execution value from "All nodes" to "Primary node". After applying this change the issue was resolved and only one copy of each record is fetched.
Is this a bug in NiFi, or is this normal behaviour? What if I need all nodes to participate in fetching records from the database, so that the primary node is not overloaded?
Thanks
Created 02-27-2017 12:51 PM
That's by design. You want to run this processor on the primary node only and distribute the load further down the flow. Here's an article describing a similar approach: https://community.hortonworks.com/articles/16120/how-do-i-distribute-data-across-a-nifi-cluster.html
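To make the difference concrete, here is a rough toy model in Python. It is not NiFi code and uses no NiFi API; the table contents, node names, and function names are all made up for illustration, and the distribution step is modelled as simple round-robin, which is an assumption and not a description of how NiFi balances load. It only shows why running the same query on every node yields one copy per node, while querying on one node and spreading the results afterwards yields one copy in total.

```python
# Toy model only: this is not NiFi code and uses no NiFi API.
from collections import Counter

ROWS = [{"id": i} for i in range(1, 7)]   # hypothetical source table
NODES = ["node1", "node2", "node3"]       # hypothetical 3-node cluster


def fetch_on_all_nodes(rows, nodes):
    """Every node runs the same query, so every node emits every row."""
    return [(node, row["id"]) for node in nodes for row in rows]


def fetch_on_primary_then_distribute(rows, nodes):
    """Only one node queries; its rows are then spread round-robin (assumed)."""
    return [(nodes[i % len(nodes)], row["id"]) for i, row in enumerate(rows)]


def copies_per_row(emitted):
    """Count how many times each row id appears across the whole cluster."""
    return Counter(row_id for _, row_id in emitted)


all_nodes = fetch_on_all_nodes(ROWS, NODES)
primary = fetch_on_primary_then_distribute(ROWS, NODES)

print("copies per row, all nodes:   ", dict(copies_per_row(all_nodes)))
print("copies per row, primary only:", dict(copies_per_row(primary)))
print("rows handled per node after distribution:",
      dict(Counter(node for node, _ in primary)))
```

In this sketch the downstream work is still shared by all three nodes in the second case; only the query itself runs once.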
Created 03-06-2017 02:29 PM
That doesn't make sense to me. QueryDatabaseTable keeps its state at cluster scope, so I would have thought that makes it possible to fetch data in parallel.
Or does the state exist for some other purpose?
Created 03-06-2017 02:29 PM
Did you find a solution for it?
Created 03-22-2017 10:03 AM
No, I'm only using the Primary node for those types of processors.