06-04-2021
01:44 PM
@khaldoune Some components maintain state because they were developed with the intent of being used in a NiFi cluster, where they support protocols that are not cluster friendly.

Example (getting data from an SFTP server): In a standalone NiFi you would use the GetSFTP processor (does not record state). In a NiFi cluster you would use the ListSFTP (records state) and FetchSFTP processors to do the same task. The ListSFTP processor would be configured to execute on "primary node" only. That way you do not have every node in your cluster trying to list the same files on your target SFTP server. The success relationship from ListSFTP, which carries FlowFiles with no content and only metadata/attributes, is then connected to a FetchSFTP processor. The connection between those two processors would be configured to load balance those FlowFiles across all nodes. Now the heavy work of ingesting the actual content for each of the listed FlowFiles is spread across all nodes in the cluster.

Even if you use the above processors in a standalone NiFi, they will still record state. Cluster state is generally stored to help when a primary node change occurs. That way, when the newly elected primary node starts executing the processors configured for primary node only, those processors fetch the last known state from ZooKeeper (ZK) so that they do not list the same files already listed by the previous primary node.

Just some more context for you on how state is primarily used by components and why.

Matt
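To make the idea above a bit more concrete, here is a small Python simulation of the list/fetch pattern. This is only a sketch and not NiFi code: the `cluster_state` dict stands in for the state kept in ZK, the `last_listed_ts` key, the node names, and both function names are made up for illustration, and the real ListSFTP processor tracks more than a single timestamp.

```python
from itertools import cycle

# Hypothetical stand-in for the cluster state stored in ZooKeeper.
cluster_state = {}


def list_new_files(remote_files, state):
    """Return names of files newer than the last recorded timestamp.

    remote_files is a list of (name, modified_timestamp) tuples.
    The newest timestamp seen is written back to the shared state.
    """
    last_seen = state.get("last_listed_ts", 0)
    new_files = sorted(
        (f for f in remote_files if f[1] > last_seen),
        key=lambda f: f[1],
    )
    if new_files:
        state["last_listed_ts"] = new_files[-1][1]
    return [name for name, _ in new_files]


def load_balance(listed, nodes):
    """Distribute the listed entries round-robin across all nodes."""
    assignment = {node: [] for node in nodes}
    for name, node in zip(listed, cycle(nodes)):
        assignment[node].append(name)
    return assignment


nodes = ["node1", "node2", "node3"]
remote = [("a.csv", 100), ("b.csv", 200), ("c.csv", 300)]

# node1 is the primary node and performs the listing.
first_listing = list_new_files(remote, cluster_state)
work = load_balance(first_listing, nodes)

# A new file arrives and the primary node changes to node2.
# node2 reads the same shared state, so it lists only the new file.
remote.append(("d.csv", 400))
second_listing = list_new_files(remote, cluster_state)
```

Because the second listing reads the timestamp that the first one stored, only `d.csv` is picked up after the primary node change, while the fetch work from the first listing is spread over all three nodes.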