EvaluateJsonPath processor configuration

Hello everyone,

I have a question about the EvaluateJsonPath processor configuration.

What is the significance of the Execution setting on the Scheduling tab of EvaluateJsonPath?

I have observed that when I select Primary node instead of All nodes, the queue builds up as shown in the image and stays the same; it doesn't decrease. If I select All nodes, the queue gradually drains to 0.

Please suggest.

screen-shot-2019-01-17-at-60906-pm.png

1 ACCEPTED SOLUTION

Master Guru
@Manish Parab

Your GetMongo processor is running on All nodes, which means every node pulls the same data.
If you set the EvaluateJsonPath processor to run on the primary node only, the flowfiles on all the other nodes are left in the queue in front of EvaluateJsonPath, because nothing processes the flowfiles that were pulled on the non-primary nodes.

Run the GetMongo processor on the primary node only and keep the EvaluateJsonPath processor running on all nodes. The reason to keep EvaluateJsonPath on all nodes: if the NiFi primary node changes, an EvaluateJsonPath processor restricted to the primary node will not process the flowfiles still queued on the old primary node.
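
To make the queue behaviour concrete, here is a toy Python model of a three-node cluster. It is only a sketch and not NiFi code: the function `simulate`, the node names, and the document names are all made up for illustration. The one assumption it encodes is the point above, that each node holds its own local queue and a processor only drains the queue on a node where it is allowed to run.

```python
from collections import deque


def simulate(nodes, primary, source_on_all, consumer_on_all, docs):
    """Return the number of flowfiles left queued on each node (toy model)."""
    queues = {node: deque() for node in nodes}

    # Source step (GetMongo in the thread): every node that runs the
    # source pulls the full result set into its own local queue.
    for node in nodes:
        if source_on_all or node == primary:
            queues[node].extend(docs)

    # Consumer step (EvaluateJsonPath in the thread): a node only drains
    # its own queue, and only if the consumer is scheduled on that node.
    for node in nodes:
        if consumer_on_all or node == primary:
            queues[node].clear()

    return {node: len(queue) for node, queue in queues.items()}


nodes = ["node1", "node2", "node3"]  # hypothetical node names
docs = ["doc-a", "doc-b"]            # hypothetical query results

# Configuration from the question: source on all nodes, consumer on primary only.
stuck = simulate(nodes, "node1", source_on_all=True, consumer_on_all=False, docs=docs)

# Suggested configuration: source on primary only, consumer on all nodes.
fixed = simulate(nodes, "node1", source_on_all=False, consumer_on_all=True, docs=docs)

print(stuck)  # {'node1': 0, 'node2': 2, 'node3': 2}
print(fixed)  # {'node1': 0, 'node2': 0, 'node3': 0}
```

In the first configuration the non-primary nodes keep their flowfiles forever, which matches a queue count that builds up and never decreases.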

That's a really good answer. I tried it and it works, thanks @Shu. Can you please give me more insight into "same data is pulled on all nodes"?

Master Guru

@Manish Parab

Sure. In NiFi, processors that trigger the flow (source processors, for example ones scheduled with cron) need to run on the primary node only. Running them on all nodes means the same processor is triggered once on each node, so n times in an n-node cluster.
Each NiFi node works only with the data that it receives itself. GetMongo is the processor that triggers the flow in this case, so when it runs on all nodes, every node pulls the same data.
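
As a rough illustration of the "n times" point, the Python sketch below counts how often each document enters the flow. It is a toy model, not NiFi code; `pull_counts` and the document names are hypothetical, and it assumes every run of the source returns the same result set.

```python
from collections import Counter


def pull_counts(node_count, docs, primary_only):
    """Count how many copies of each document enter the flow (toy model)."""
    # The source runs once if restricted to the primary node,
    # otherwise once per node in the cluster.
    runs = 1 if primary_only else node_count
    pulled = [doc for _ in range(runs) for doc in docs]
    return Counter(pulled)


docs = ["doc-a", "doc-b"]  # hypothetical query results

on_all_nodes = pull_counts(3, docs, primary_only=False)
on_primary = pull_counts(3, docs, primary_only=True)

print(on_all_nodes["doc-a"])  # 3
print(on_primary["doc-a"])    # 1
```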

Run GetMongo (the source processor) on the primary node only, then distribute the load across the cluster using Remote Process Groups or connection load balancing.
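
Conceptually, distributing the load means the flowfiles pulled once on the primary node are spread over the cluster before the downstream processors run. The Python sketch below assumes a round-robin strategy; `round_robin`, the flowfile names, and the node names are hypothetical, and this is not how NiFi implements it internally.

```python
from itertools import cycle


def round_robin(flowfiles, nodes):
    """Assign flowfiles to nodes in turn (toy model of load distribution)."""
    assignment = {node: [] for node in nodes}
    # cycle() repeats the node list, so each flowfile goes to the next node in turn.
    for flowfile, node in zip(flowfiles, cycle(nodes)):
        assignment[node].append(flowfile)
    return assignment


flowfiles = ["ff1", "ff2", "ff3", "ff4", "ff5"]  # pulled once, on the primary node
nodes = ["node1", "node2", "node3"]              # hypothetical node names

spread = round_robin(flowfiles, nodes)
print(spread)
# {'node1': ['ff1', 'ff4'], 'node2': ['ff2', 'ff5'], 'node3': ['ff3']}
```

Each flowfile ends up on exactly one node, so there are no duplicates and every node shares the processing work.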

That makes sense now; it explains why I was getting duplicate data from Mongo when I was running GetMongo on all nodes. Thanks again @Shu.