How to force ListFile processor to read a directory on a particular node?

I have a flow that includes ListFile > FetchFile > (other processors), which works fine in non-cluster mode.

Now I want to switch to cluster mode. I know that I can set ListFile to run on the primary node, then use a Remote Process Group to run FetchFile and the other processing.

But my files only exist on node A, and node B has no access to that folder. Also, ZooKeeper elects the primary node automatically, so the flow will probably throw an exception whenever the primary node is node B.

So, is there any way to somehow 'force' the ListFile processor to always list files on node A?

1 REPLY

You currently can't target specific nodes. You can only target the primary node, which can be any node in the cluster and can fail over to another node.

Options...

1) Use a network-mounted drive that is accessible to all nodes, and set ListFile to run on the primary node so it only runs on one node at a time. If the primary node fails over, whichever node takes over can still read the directory.

2) Don't use a NiFi cluster. Since ListFile can only work on one node anyway, the cluster gives you no failover for this part of the flow: if Node A goes down, ListFile cannot work until Node A comes back. Why not run a standalone NiFi on Node A and send the data via site-to-site to a cluster running somewhere else? You could even run MiNiFi on Node A to send the data back to a central NiFi.
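
For option 1, the "run on primary node" setting is normally changed on the processor's Scheduling tab in the UI. If you script your flow changes instead, a rough sketch of the update body might look like the following. It assumes the NiFi REST API's processor update (a PUT to the processor's resource) accepts a `revision` plus a `component.config.executionNode` value of `ALL` or `PRIMARY`; check this against the REST API documentation for your NiFi version. The processor id and revision number are hypothetical placeholders, and the sketch only builds the JSON body without sending anything.

```python
import json


def build_execution_node_update(processor_id, revision_version, execution_node="PRIMARY"):
    """Build a JSON-serializable body for a processor update request.

    Assumption: the body shape (revision + component.config.executionNode)
    matches the NiFi REST API for your version. Verify before using it.
    """
    if execution_node not in ("ALL", "PRIMARY"):
        raise ValueError("execution_node must be 'ALL' or 'PRIMARY'")
    return {
        # NiFi uses the revision for optimistic locking; use the processor's current value.
        "revision": {"version": revision_version},
        "component": {
            "id": processor_id,
            "config": {"executionNode": execution_node},
        },
    }


if __name__ == "__main__":
    # Both arguments are placeholders, not real values from any cluster.
    body = build_execution_node_update("hypothetical-listfile-id", 3)
    print(json.dumps(body, indent=2))
```

Note that this only restricts ListFile to whichever node is currently primary. It does not pin it to node A, which is why the shared mount in option 1 is still needed.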
