How to equally distribute flow files generated from GenerateTableFetch Processor among multiple nodes in NiFi Cluster?


Hi,

I am working on a scenario where I need to pull data from a table in SQL Server and store it elsewhere.

Currently, I have designed the following workflow (at a high level):

GenerateTableFetch --> ExecuteSQL --> ConvertRecord --> PutFile

I am working on a NiFi cluster and have observed that each node in the cluster executes this workflow independently and keeps the output files on its own local storage.

I am looking for a way to distribute the output of GenerateTableFetch, i.e. the flow files containing the queries, equally across, let's say, 4 nodes. Each node would then have a unique set of queries for the ExecuteSQL processor to execute.

For example, if I have 12 GB of data and 4 nodes, and GenerateTableFetch generates flow files whose queries each pull 1 GB of data, then the nodes should share the work and each pull 3 GB.
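To make the numbers concrete, here is a minimal Python sketch of the split I have in mind. It is only an illustration of the desired outcome, not NiFi code: the function `distribute_round_robin`, the query labels, and the node names are all hypothetical stand-ins for the flow files and cluster nodes.

```python
from collections import defaultdict
from itertools import cycle


def distribute_round_robin(flow_files, nodes):
    """Assign each flow file to the next node in turn.

    With this scheme the per-node counts differ by at most one.
    """
    assignment = defaultdict(list)
    for flow_file, node in zip(flow_files, cycle(nodes)):
        assignment[node].append(flow_file)
    return dict(assignment)


# Hypothetical stand-ins: 12 paged queries of about 1 GB each, 4 cluster nodes.
queries = [f"query-{i}" for i in range(1, 13)]
nodes = ["node-1", "node-2", "node-3", "node-4"]

assignment = distribute_round_robin(queries, nodes)
for node, assigned in assignment.items():
    # Each node ends up with 3 queries, i.e. roughly 3 GB of the 12 GB total.
    print(node, len(assigned), assigned)
```

So every query is executed exactly once, and no node pulls data that another node has already pulled.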

Can someone please help me achieve this?

Also, if each node pulls a specific set of data, what happens when that node goes down? Is there any way for a failed node's work items to be shared among the other nodes in the cluster?
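The failover behavior I am asking about could be sketched like this in Python. Again, this is only a description of what I would like to happen, under the assumption that the pending work of a failed node is still visible to the rest of the cluster; I do not know whether NiFi does anything like it. The function `reassign_on_failure` and the labels are hypothetical.

```python
from itertools import cycle


def reassign_on_failure(assignment, failed_node):
    """Return a new assignment with the failed node's pending items
    spread in turn over the surviving nodes."""
    survivors = [node for node in assignment if node != failed_node]
    if not survivors:
        raise ValueError("no surviving nodes to take over the work")
    # Copy the survivors' lists so the original assignment is left untouched.
    result = {node: list(assignment[node]) for node in survivors}
    for item, node in zip(assignment.get(failed_node, []), cycle(survivors)):
        result[node].append(item)
    return result


# Hypothetical pending work: 3 queries per node, as in the 12 GB example.
pending = {
    "node-1": ["q1", "q5", "q9"],
    "node-2": ["q2", "q6", "q10"],
    "node-3": ["q3", "q7", "q11"],
    "node-4": ["q4", "q8", "q12"],
}

after = reassign_on_failure(pending, "node-4")
for node, items in after.items():
    print(node, items)
```

In this sketch the three queries that were pending on node-4 end up on the three remaining nodes, one each.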

Can you guys help? @Matt Clarke @Bryan Bende

1 REPLY

Re: How to equally distribute flow files generated from GenerateTableFetch Processor among multiple nodes in NiFi Cluster?
