Created 09-16-2017 07:08 AM
Hi,
We are currently running NiFi as a single instance and are planning to move to a clustered setup (a 3-node cluster).
Please consider the sample flow below:
ListFile -> UpdateAttribute -> RouteOnAttribute -> ExecuteStreamCommand (executes a shell script) -> FetchFile -> UpdateAttribute -> FetchFile -> PutFile
Since we are going to run in a clustered setup, we need to use a Remote Process Group (RPG) to balance the load. We are going to place the RPG after the ListFile processor:
ListFile (on Primary Node) -> RPG
Input port -> UpdateAttribute -> RouteOnAttribute -> ExecuteStreamCommand (executes a shell script) -> FetchFile -> UpdateAttribute -> FetchFile -> PutFile
My question is: if I want ExecuteStreamCommand (which triggers a shell script) to execute only on the primary node and the rest of the processors to run on all nodes, can I go ahead and change that processor's settings to run 'On Primary Node'? Will it have any impact on the flow?
Thanks,
R.Rohit
Created 09-16-2017 12:22 PM
Hi @Rohit Ravishankar, yes, it will have an impact on the flow.
You are going to have a 3-node cluster and are thinking of using an RPG after the ListFile processor.
Let's say M01, M02 and M03 are the 3 NiFi nodes in the cluster, and M01 is the Primary Node.
1. When the ListFile processor runs and sends its output to the RPG, it is not guaranteed that the flowfile will go to the Primary Node (M01).
2. The RPG takes care of load balancing across the NiFi cluster and distributes the flowfiles accordingly.
3. If you run ExecuteStreamCommand on the Primary Node only, it triggers the command only when the flowfile is on the Primary Node at that time. Under the assumption above, the processor triggers the shell script only when the flowfile is on the M01 node.
4. If the RPG distributes a flowfile to M02 or M03 while ExecuteStreamCommand is running on the Primary Node only, those flowfiles won't trigger the shell script.
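To make the points above concrete, here is a small toy simulation in Python. It is only a sketch and does not use any NiFi API: the `distribute` and `run_primary_only` functions are hypothetical helpers, and plain round-robin is an assumption standing in for however the RPG actually spreads flowfiles across nodes.

```python
from collections import defaultdict
from itertools import cycle

NODES = ["M01", "M02", "M03"]
PRIMARY = "M01"  # assumed Primary Node, as in the example above


def distribute(flowfiles, nodes):
    """Toy stand-in for RPG load balancing (assumption: simple round-robin)."""
    queues = defaultdict(list)
    for flowfile, node in zip(flowfiles, cycle(nodes)):
        queues[node].append(flowfile)
    return queues


def run_primary_only(queues, primary):
    """Model a processor scheduled on the Primary Node only.

    Flowfiles queued on the primary node are processed; flowfiles that
    landed on any other node are returned as still waiting.
    """
    processed = list(queues[primary])
    waiting = {
        node: list(files)
        for node, files in queues.items()
        if node != primary and files
    }
    return processed, waiting


flowfiles = [f"file{i}.txt" for i in range(1, 7)]
queues = distribute(flowfiles, NODES)
processed, waiting = run_primary_only(queues, PRIMARY)

print("shell script triggered for:", processed)
print("never reached the script:", waiting)
```

In this toy run only the flowfiles that happened to land on M01 reach the script; the ones sent to M02 and M03 do not, which is the impact described in the list above.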
Created 09-16-2017 04:46 PM
@Yash Thanks!