Support Questions

QueryDatabaseTable to run in a distributed manner

Explorer

I have used the QueryDatabaseTable processor in NiFi to extract data from Oracle incrementally. I have a 2-node NiFi cluster with a good configuration. But because QueryDatabaseTable is designed to run only on the primary node, one node is over-utilized and the other is under-utilized. How can I make full use of my cluster? I have multiple tables in Oracle, and the load is increasing on only one node.

4 REPLIES

Super Guru

@Ishan Kumar

Use the GenerateTableFetch processor, which generates SQL SELECT queries that fetch "pages" of rows from a table. Then feed its success relationship to a Remote Process Group to distribute the queries across the cluster.

Then use the ExecuteSQL processor to run the SQL queries generated by GenerateTableFetch.
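To make the "pages" idea concrete, here is a minimal Python sketch of how a table fetch could be broken into independent page queries, each of which can then run on a different node. It is not NiFi's actual implementation: the function name `page_queries`, the table `ORDERS`, the column `ORDER_ID`, and the Oracle 12c-style `OFFSET ... FETCH NEXT` paging syntax are all assumptions chosen for illustration.

```python
def page_queries(table, order_column, total_rows, page_size, last_max=None):
    """Build one SELECT statement per page of rows.

    Hypothetical helper for illustration only. `last_max` stands in for the
    highest value of `order_column` seen in a previous run, so that only
    newer rows are fetched (incremental load). It is assumed to be numeric.
    """
    where = f" WHERE {order_column} > {last_max}" if last_max is not None else ""
    queries = []
    for offset in range(0, total_rows, page_size):
        queries.append(
            f"SELECT * FROM {table}{where} ORDER BY {order_column} "
            f"OFFSET {offset} ROWS FETCH NEXT {page_size} ROWS ONLY"
        )
    return queries


if __name__ == "__main__":
    # 25 new rows with a page size of 10 gives three independent queries.
    for query in page_queries("ORDERS", "ORDER_ID", 25, 10, last_max=100):
        print(query)
```

Each generated statement is small and self-contained, which is why the queries can be handed to any node and executed there by ExecuteSQL.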

See these links for more details on Remote Process Groups, and link1, link2, and link3 for GenerateTableFetch processor usage.

Super Guru
@Ishan Kumar

The QueryDatabaseTable (QDT) processor cannot itself be distributed, because it is designed to run on the primary node only and pulls all incremental changes on its next run. However, you can use QDT properties such as Max Rows Per Flow File to split the result rows across multiple flowfiles, then feed the success relationship to a Remote Process Group. This way the downstream load is distributed across the cluster, even though the processor itself still runs only on the primary node.
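As a rough illustration of that splitting step, the Python sketch below cuts one result set into fixed-size chunks (one chunk per flowfile) and spreads them over two nodes. The helper names `split_rows` and `assign_round_robin`, the node names, and the round-robin assignment are assumptions made for the example; how a Remote Process Group actually balances flowfiles between nodes may differ.

```python
from itertools import islice


def split_rows(rows, max_rows_per_flowfile):
    """Yield lists of at most `max_rows_per_flowfile` rows each."""
    iterator = iter(rows)
    while True:
        chunk = list(islice(iterator, max_rows_per_flowfile))
        if not chunk:
            return
        yield chunk


def assign_round_robin(chunks, nodes):
    """Assign chunks to nodes in turn (a simplifying assumption)."""
    assignment = {node: [] for node in nodes}
    for index, chunk in enumerate(chunks):
        assignment[nodes[index % len(nodes)]].append(chunk)
    return assignment


if __name__ == "__main__":
    # Seven rows fetched on the primary node, at most three rows per flowfile.
    flowfiles = split_rows(range(7), 3)
    print(assign_round_robin(flowfiles, ["node1", "node2"]))
```

The query still runs once on the primary node; only the work done on the resulting flowfiles is shared.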

For more details on configuring the QDT processor, refer to this link.

Explorer

@Shu

Your idea makes sense. But can I achieve the same with the QueryDatabaseTable processor? I don't want to change the design now, as there are multiple flows already running.

Explorer

Thanks @Shu. I will try this approach. Thanks.
