Find Processors with Stored State

Contributor

Good afternoon!  In planning out how we reset an environment when we have a refresh of one of our source systems, I need to ensure that any stateful processors (like QueryDatabaseTableRecord) have their state cleared so that they pick up all new records.

 

Is there a way to find all processors storing state?  Or, even better, to just destroy the state of all processors on the NiFi instance?

 

Thanks.

2 ACCEPTED SOLUTIONS

Master Mentor

@kellerj 
Depending on your setup, state may be stored locally, in ZooKeeper, or in a mix of both.

In a standalone (non-clustered) NiFi setup, all state is stored in the local NiFi state directory.
In a NiFi cluster setup (even if the cluster consists of only one node), some processors store state in a mix of local files (state specific to a single node) and ZooKeeper (cluster state that needs to be shared among all nodes in the NiFi cluster).

The NiFi configuration file "state-management.xml" defines where both local state and cluster state are stored.

You can clear all local state by emptying the contents of the configured local state directory on every node.  The "Directory" property is defined in the "local-provider" section of the state-management.xml file.

You can clear the cluster state by clearing out the znode in ZooKeeper that is defined in the "cluster-provider" section of the state-management.xml file.

NiFi must be shut down before clearing out either local or cluster state.
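For anyone scripting this, here is a minimal Python sketch of the steps above: read state-management.xml to find where state lives, then empty the local state directory. The function names and the sample XML are illustrative assumptions, and the "Connect String" and "Root Node" property names are what I would expect to see in the ZooKeeper provider section, so check them against your own file. Run something like this only while NiFi is stopped.

```python
import shutil
import xml.etree.ElementTree as ET
from pathlib import Path


def read_state_locations(xml_text: str) -> dict:
    """Return the local state directory and ZooKeeper settings from state-management.xml."""
    root = ET.fromstring(xml_text)

    def props(section_name: str) -> dict:
        # Collect <property name="...">value</property> pairs from one provider section.
        section = root.find(section_name)
        if section is None:
            return {}
        return {p.get("name"): (p.text or "").strip() for p in section.findall("property")}

    local = props("local-provider")
    cluster = props("cluster-provider")
    return {
        "local_directory": local.get("Directory"),
        # Assumed property names for the ZooKeeper provider; verify in your file.
        "zk_connect_string": cluster.get("Connect String"),
        "zk_root_node": cluster.get("Root Node"),
    }


def empty_directory(directory: Path) -> int:
    """Delete everything inside `directory` but keep the directory itself."""
    removed = 0
    for child in directory.iterdir():
        if child.is_dir() and not child.is_symlink():
            shutil.rmtree(child)
        else:
            child.unlink()
        removed += 1
    return removed


# Illustrative sample only; a real file has more elements and properties.
SAMPLE = """
<stateManagement>
  <local-provider>
    <id>local-provider</id>
    <property name="Directory">./state/local</property>
  </local-provider>
  <cluster-provider>
    <id>zk-provider</id>
    <property name="Connect String">zk1:2181</property>
    <property name="Root Node">/nifi</property>
  </cluster-provider>
</stateManagement>
"""

locations = read_state_locations(SAMPLE)
print(locations["local_directory"], locations["zk_root_node"])
```

The sketch only reports the ZooKeeper connect string and root node; removing the znode itself is left to your ZooKeeper tooling. On a cluster, `empty_directory` would need to run on every node.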

 

If you found that the provided solution(s) assisted you with your query, please take a moment to log in and click Accept as Solution below each response that helped.

Thank you,

Matt

Master Mentor

@kellerj 
The alternative is to make REST API calls, or to navigate to every processor that writes state and purge it manually.  You would need to keep track of all your current and newly added processors that store state in order to accomplish this, but shutting down NiFi is not needed to clear state this way.  🙂 

As far as finding all processors that store state, that is a challenge in itself.  The embedded documentation for every processor has a "State Management:" section that tells you whether the component stores state. (Note that processors that show only "Cluster" state will store that state locally on a standalone, non-clustered instance of NiFi.)

Once you have identified all the components your operations team is using by their unique UUIDs and filtered them down to those that write state, you can use that info to clear state using NiFi's REST API, which is a multi-request process.  Then you need to worry about your operations team adding additional state-based components later, or removing and re-adding an existing state-based processor, which results in a new UUID. So while possible, it is challenging to orchestrate and maintain.
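One way to cope with changing UUIDs is to rebuild the list of stateful processors on every run instead of storing it. The Python sketch below filters a processor listing by type. It assumes the listing looks like the JSON I would expect from `GET /nifi-api/process-groups/root/processors?includeDescendantGroups=true` (a "processors" array whose entries carry "id" and "component.type"); treat both the endpoint parameter and the field names as assumptions to verify on your NiFi version. `STATEFUL_TYPES` is a hypothetical, hand-maintained set built from the "State Management:" sections of the processor documentation.

```python
# Hypothetical, hand-maintained set of processor types that store state.
STATEFUL_TYPES = {
    "org.apache.nifi.processors.standard.QueryDatabaseTableRecord",
}


def stateful_processor_ids(listing: dict, stateful_types: set) -> list:
    """Return the IDs of processors in `listing` whose type is in `stateful_types`."""
    ids = []
    for entity in listing.get("processors", []):
        component = entity.get("component", {})
        if component.get("type") in stateful_types:
            ids.append(entity["id"])
    return ids


# Hand-written example listing; the IDs are placeholders, not real UUIDs.
example_listing = {
    "processors": [
        {"id": "proc-1", "component": {
            "type": "org.apache.nifi.processors.standard.QueryDatabaseTableRecord"}},
        {"id": "proc-2", "component": {
            "type": "org.apache.nifi.processors.standard.LogAttribute"}},
    ]
}

print(stateful_processor_ids(example_listing, STATEFUL_TYPES))
```

Because the IDs are recomputed each time, a processor that was removed and re-added is picked up under its new UUID; only the type list needs maintenance.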

You can also right-click a component to view its state and then clear that state directly from the state listing in the NiFi UI.

If you found that the provided solution(s) assisted you with your query, please take a moment to log in and click Accept as Solution below each response that helped.

Thank you,

Matt


4 REPLIES

Contributor

Thanks for the response.  TL;DR: It's complicated. 😄 

I don't think my operations team will be a fan of shutting things down and purging files and/or zookeeper nodes.  But, this is good to know for when we have an important reason to purge state.

 

Contributor

Yep!  I had moved on to that after your response and written up a JIRA for my team to build the needed tool.  Thanks again for the response.  My plan is below.

 

> Find all the processors which may have state, then check if each has state, and if so, clear it.

 

  1. Find processors of used types which can contain state: e.g., QueryDatabaseTableRecord
  2. For each, pull the state by the ID and check if it has any stored state
    1. if totalEntryCount of cluster or local state is non-zero
  3. If so, call the clear state endpoint
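Step 2 of the plan can be sketched in Python as below. The sketch assumes the state response wraps everything in a "componentState" object with optional "clusterState" and "localState" members, each carrying a "totalEntryCount"; those field names are my reading of the response and should be checked against a real reply from your instance. The function names are illustrative.

```python
def total_entries(state_entity: dict) -> int:
    """Sum totalEntryCount across cluster and local scope; missing scopes count as 0."""
    component_state = state_entity.get("componentState") or {}
    total = 0
    for scope in ("clusterState", "localState"):
        scope_state = component_state.get(scope) or {}
        total += scope_state.get("totalEntryCount", 0)
    return total


def needs_clear(state_entity: dict) -> bool:
    """True when the processor has any stored state in either scope."""
    return total_entries(state_entity) > 0


# Hand-written example responses, not captured from a real instance.
with_state = {"componentState": {
    "clusterState": {"scope": "CLUSTER", "totalEntryCount": 2},
    "localState": {"scope": "LOCAL", "totalEntryCount": 0},
}}
without_state = {"componentState": {"localState": {"totalEntryCount": 0}}}

print(needs_clear(with_state), needs_clear(without_state))
```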

Query URL: GET /nifi-api/flow/search-results?q=QueryDatabaseTableRecord

Pull Processor State: GET /nifi-api/processors/41584e33-adc6-171d-0000-0000581caccb/state

Clear Processor State: POST /nifi-api/processors/41584e33-adc6-171d-0000-0000581caccb/state/clear-requests
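A rough Python sketch of the request plumbing for those three endpoints, using only the standard library. The helper names and the base URL are assumptions, no authentication is added (a secured instance would need a token or client certificate), and the POST is sent with an empty body, which I believe the clear-requests endpoint accepts but have not confirmed for every version.

```python
import json
import urllib.parse
import urllib.request


def search_url(base_url: str, term: str) -> str:
    """URL for the flow search endpoint, with the search term URL-encoded."""
    query = urllib.parse.urlencode({"q": term})
    return f"{base_url}/nifi-api/flow/search-results?{query}"


def state_url(base_url: str, processor_id: str) -> str:
    """URL that returns the stored state of one processor."""
    return f"{base_url}/nifi-api/processors/{processor_id}/state"


def clear_url(base_url: str, processor_id: str) -> str:
    """URL that accepts a POST to clear the state of one processor."""
    return state_url(base_url, processor_id) + "/clear-requests"


def send(method: str, url: str) -> dict:
    """Send a request and decode the JSON reply. No authentication is added here."""
    data = b"" if method == "POST" else None
    request = urllib.request.Request(url, data=data, method=method)
    with urllib.request.urlopen(request) as response:
        body = response.read()
    return json.loads(body) if body else {}


# Example calls, left as comments so nothing is sent when this file is loaded.
# The base URL is an assumption:
#   state = send("GET", state_url("http://localhost:8080", processor_id))
#   send("POST", clear_url("http://localhost:8080", processor_id))
```

As far as I know, NiFi only allows state to be cleared while the processor is stopped, so a real tool would likely need to stop each processor first and restart it afterwards.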