
Are stateful variable values in a NiFi cluster reset to their initial values when the cluster restarts?

Rising Star

@mattburgess, @markpayne Hi all,

I am using stateful variables to generate an incrementing batchid value with the UpdateAttribute processor. The flow runs in a cluster, and I have set the processor to run on all nodes. However, the batchid values are not generated incrementally: the processor sometimes skips values or generates a duplicate. Is this happening because restarting the cluster wipes the stateful variable data? Could you please suggest how I can persist the stateful variable data? I have attached the UpdateAttribute configuration for reference.

Please suggest!!

Attachment: 107311-updateattribute-properties.png

1 ACCEPTED SOLUTION

Super Mentor

@sri chaturvedi

The UpdateAttribute processor's state capability can only store state locally on each node in the cluster. No node has any knowledge of the local state values stored on the other nodes.

So I suspect one or both of the following is occurring:
1. The upstream dataflow data originates on the Primary node only (for example, the source ingest processor is configured for "primary node" execution). On NiFi restart, a different node in your 3-node cluster is elected as Primary node, so the FlowFiles traversing this dataflow are now on a different node where there is no previous state, and it appears as if state started over. This would explain the appearance of a reset on cluster restart.
2. This UpdateAttribute processor is receiving inbound FlowFiles on all three nodes. Since each node stores its own state locally for this processor, each node increments its own count independently of the others. This would explain the duplicate state values seen on some FlowFiles.
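
To make the two scenarios above concrete, here is a minimal Python sketch. It is a toy simulation only, not NiFi code: the `NodeLocalCounter` class, the node names, the round-robin distribution, and the `batchN` format are all assumptions made for illustration. Each object stands in for one node's local state, which no other node can see.

```python
from itertools import cycle


class NodeLocalCounter:
    """Hypothetical stand-in for one node's local processor state (not a NiFi API)."""

    def __init__(self, name, initial=0):
        self.name = name
        self.value = initial  # visible to this node only

    def next_batch_id(self):
        # Increment this node's own counter and build the attribute value.
        self.value += 1
        return f"batch{self.value}"


def run_on_all_nodes(node_names, flowfile_count):
    """Scenario 2: FlowFiles arrive on every node (assumed round-robin here)."""
    nodes = [NodeLocalCounter(name) for name in node_names]
    assigned = []
    for _, node in zip(range(flowfile_count), cycle(nodes)):
        assigned.append((node.name, node.next_batch_id()))
    return assigned


def run_on_primary(nodes, primary_sequence):
    """Scenario 1: only the current Primary node receives FlowFiles."""
    by_name = {node.name: node for node in nodes}
    return [by_name[primary].next_batch_id() for primary in primary_sequence]


if __name__ == "__main__":
    # Three nodes counting independently produce the same ids more than once.
    for node_name, batch_id in run_on_all_nodes(["node1", "node2", "node3"], 6):
        print(node_name, batch_id)

    # The Primary role moves from node1 to node2 after a restart, so the
    # sequence appears to start over even though node1's state still exists.
    nodes = [NodeLocalCounter("node1"), NodeLocalCounter("node2")]
    print(run_on_primary(nodes, ["node1", "node1", "node1", "node2", "node2"]))
```

In the first run every id is issued once per node, which matches the duplicates described in point 2. In the second run the counter on the newly elected node starts from its own initial value, which looks like a reset on restart as described in point 1.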


Thank you,

Matt

If you found that this answer addressed your question, please take a moment to log in and click the "ACCEPT" link.




2 REPLIES


Rising Star

@Matt Clarke, @matt burgess Exactly the second point is happening: each node generates its own value, incrementing from the last value stored in its local state. So which processor or method should I use to generate an incremental batchid (batch1, batch2, and so on), since UpdateAttribute is mixing up values when running on a cluster? Or is there any property that lets the UpdateAttribute processors on all nodes pick up each other's last state value? Please suggest.