Support Questions

srijitachaturve · ‎03-20-2019

@mattburgess, @markpayne Hi all,

I am using stateful variables in an UpdateAttribute processor to generate an incrementing batchid value. NiFi runs as a cluster, and the processor is set to run on all nodes. However, the generated batchid values are not strictly incremental: the processor sometimes skips values and sometimes generates duplicates. Is this caused by a cluster restart wiping the stateful variable data? Could you please suggest how I can persist the stateful variable data? I have attached the UpdateAttribute configuration for reference.

Please suggest!!

MattWho · ‎03-20-2019

@sri chaturvedi

The UpdateAttribute processor can only store state locally on each node in the cluster. No node has any knowledge of the state value stored on the other nodes.

So I suspect one or both of the following is occurring:
1. The upstream data originates on the Primary node only (for example, the source ingest processor is configured for "Primary node" execution). On NiFi restart, a different node in your 3-node cluster is elected Primary node, so the FlowFiles traversing this dataflow are now on a node that has no previous state, and the count appears to start over. This would explain the appearance of a reset on cluster restart.
2. This UpdateAttribute processor is receiving inbound FlowFiles on all three nodes. Since each node stores its own state locally for this processor, each node increments its own count independently of the others. This would explain the duplicate state values seen on some FlowFiles.
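
To make both scenarios concrete, here is a minimal sketch in Python. It is not NiFi code: `NodeLocalCounter`, `simulate_all_nodes`, and `simulate_primary_switch` are hypothetical names used only for illustration, and the round-robin distribution of FlowFiles across nodes is an assumption (the real distribution depends on your flow). Each counter stands in for the local state that one node keeps for the processor.

```python
class NodeLocalCounter:
    """Hypothetical stand-in for the local state one node keeps for the processor."""

    def __init__(self, initial=0):
        self.value = initial

    def next_batch_id(self):
        # Each node only reads and updates its own counter.
        self.value += 1
        return f"batch{self.value}"


def simulate_all_nodes(num_nodes=3, num_flowfiles=6):
    """Scenario 2: FlowFiles arrive on every node (round-robin is assumed here)."""
    nodes = [NodeLocalCounter() for _ in range(num_nodes)]
    return [nodes[i % num_nodes].next_batch_id() for i in range(num_flowfiles)]


def simulate_primary_switch(before=3, after=2):
    """Scenario 1: only the Primary node sees data, and the Primary changes on restart."""
    nodes = [NodeLocalCounter() for _ in range(3)]
    ids = [nodes[0].next_batch_id() for _ in range(before)]  # node 0 is Primary
    ids += [nodes[1].next_batch_id() for _ in range(after)]  # node 1 elected after restart
    return ids


if __name__ == "__main__":
    print(simulate_all_nodes())
    # ['batch1', 'batch1', 'batch1', 'batch2', 'batch2', 'batch2']
    print(simulate_primary_switch())
    # ['batch1', 'batch2', 'batch3', 'batch1', 'batch2']
```

In the first run every node hands out batch1 before any node reaches batch2, which is the duplication described in point 2. In the second run the count falls back to batch1 once a node with no prior state takes over, which matches point 1.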

Thank you,

Matt

If you found that this answer addressed your question, please take a moment to log in and click the "ACCEPT" link.

srijitachaturve · ‎03-20-2019

@Matt Clarke, @matt burgess Exactly the second point is happening: each node generates its own value, incrementing from the last value stored in its local state. So which processor or method should I use to generate an incremental batchid (batch1, batch2, and so on), since UpdateAttribute produces inconsistent values when running on a cluster? Or is there a property that lets the UpdateAttribute processors on all nodes pick up each other's last state value? Please suggest.
