Force delete Process Group

Explorer

Hello everyone,

I'm running an 11-node NiFi 1.15.3 cluster. One of the process groups is versioned on NiFi Registry, and for some reason the local flowfile does not reflect the versioned configuration, so now the process group is stuck: I cannot do anything with it, not even move it on the canvas, because it always returns an error:

 

Node XXXXXXXXX is unable to fulfill this request due to: [15, xxxxx-xxxxx-xxxxxx] is not the most up-to-date revision. This component appears to have been modified

 

The local configuration shows no changes, and nothing I have tried so far has worked (deleting the flow file, restarting the cluster node, etc.), so I just want to delete the process group and deploy it again from the registry. However, the web interface won't let me and throws the same error.

Is there a way to force the deletion of the process group?

 

Thanks

7 REPLIES

Super Mentor

@wasabipeas 

The revision is incremented any time a change occurs on a component, to make sure that all nodes are running the exact same dataflow. Revisions have nothing directly to do with version-controlled dataflows. If you were to restart your entire cluster (not a rolling restart, but a full shutdown and start of all nodes), component revisions would start over.
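To make the revision idea concrete, here is a toy model of a per-component revision check across nodes. It is only an illustration of the general optimistic-locking pattern, not NiFi's actual implementation; `NodeView`, `apply_change`, and `StaleRevisionError` are hypothetical names invented for this sketch.

```python
from dataclasses import dataclass


class StaleRevisionError(Exception):
    """Raised when a request carries a revision the node does not expect."""


@dataclass
class NodeView:
    """One node's view of a component's revision (hypothetical model)."""
    node_name: str
    component_id: str
    version: int = 0

    def check(self, client_version: int) -> None:
        # A node only accepts a request built on the revision it currently holds.
        if client_version != self.version:
            raise StaleRevisionError(
                f"Node {self.node_name}: [{client_version}, {self.component_id}] "
                "is not the most up-to-date revision"
            )


def apply_change(nodes: list, client_version: int) -> int:
    """Apply a change cluster-wide only if every node accepts the revision."""
    for node in nodes:
        node.check(client_version)  # any single mismatch rejects the request
    for node in nodes:
        node.version = client_version + 1
    return client_version + 1
```

In this model, a single node holding a different revision is enough to reject every request against that component, including a move or a delete, which is consistent with the behavior described in the question.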

"for some reason the local flowfile does not reflect the versioned configuration"
Are you saying that if you access the NiFi UI from a different node in your 11-node cluster, this process group renders differently?

Screenshots would be helpful in understanding your descriptions.
Does the process group indicate it is under version control?
Does it report "local changes"?

Revision issues can happen when a NiFi node is not running the same version as the other nodes in the cluster.
Let's say some processor component you are using has a newer version on other nodes, and the newer version of the processor introduced a new property. On some nodes the property exists, and on others it does not.
I suggest verifying that all nodes in your cluster are running the same version of NiFi. Additionally, compare the contents of the NiFi lib directory (or directories) to make sure they are the same on all nodes. This includes any custom lib directories and anything you may have added to the extensions directory.
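One possible way to do the lib comparison is to build a checksum manifest per node and diff the manifests. The sketch below uses only the Python standard library; the function names `lib_manifest` and `manifest_diff` are made up for illustration, and the directory you pass in (for example the lib folder under your NiFi install) is an assumption about your layout.

```python
import hashlib
from pathlib import Path


def lib_manifest(lib_dir):
    """Map each file's path (relative to lib_dir) to its SHA-256 digest."""
    root = Path(lib_dir)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256()
            with path.open("rb") as handle:
                # Read in chunks so large NAR/JAR files are not loaded at once.
                for chunk in iter(lambda: handle.read(1024 * 1024), b""):
                    digest.update(chunk)
            manifest[path.relative_to(root).as_posix()] = digest.hexdigest()
    return manifest


def manifest_diff(a, b):
    """Return the relative paths that are missing on one side or differ in content."""
    return sorted(name for name in a.keys() | b.keys() if a.get(name) != b.get(name))
```

In practice you would run `lib_manifest` on each node, save the result (for instance as JSON), and call `manifest_diff` on pairs of saved manifests; an empty list means the two directories match file for file.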

Thank you,

Matt

Explorer

Hi Matt, 

 

Thank you for your help. I'll try to give you more context about the issue.

 

Yes, all the nodes of the cluster run the same version: 1.15.3 and the same libs.

 

Unfortunately, shutting down the entire cluster is not an option because it is a production environment and it receives live streaming data we cannot afford to lose. 

 

We operate the environment this way:

 

  1. We create the process groups, flows and configurations on a staging cluster (3 nodes, same NiFi version 1.15.3).
  2. We commit the changes to the registry.
  3. We then pull them from the registry to the production cluster (11 nodes).

This is why I'm sure that the configurations stored in the registry are valid and working: the same process group on the staging cluster has no issues.

 



Are you saying that if you access the NiFi UI from a different node in your 11-node cluster, this process group renders differently?

No, the process group renders the same on all nodes, but every attempt to do anything on it results in the same error message:

 

wasabipeas_0-1663753199932.png

 

We cannot edit it, delete it, change its version, detach it from version control, or even move it. It always returns the same error, although no modification is present locally.

 

In fact, the version control menu seems to "believe" that there are local changes to the PG:

 

wasabipeas_1-1663753475390.png

 

But then, if I select "Show local changes", nothing is shown, as expected:

 

wasabipeas_2-1663753565191.png

 

wasabipeas_3-1663753665132.png

 

Same if I select "Revert local changes":

wasabipeas_4-1663753706561.png

 

My assumption is that the process group definition in the local flow file is corrupted or not in sync with the version control of the registry. 

 

So I think that the best solution is to just force-delete the PG and then re-create it.

 

Is there a way to do this?

 

Thank you

 

S

 

Super Mentor

@wasabipeas 

Also, which version of NiFi Registry is being used?

In your NiFi UI, search for the component UUID (a8db3982-1350-1b8b-ffff-fffff988699d).
What kind/type of component is it? What is the current state of the component (enabled, disabled, running, stopped, enabling, disabling, starting, stopping)?
Please share a screenshot of its current configuration.

Thanks,

Matt

Explorer

@MattWho 

 

Hi Matt,

 

The registry is also version 1.15.3. Some more context: the cluster contains dozens of other process groups that have been managed in the same way and versioned on the same registry for months. This is the first time we have experienced this issue.

 

The component with UUID a8db3982-1350-1b8b-ffff-fffff988699d is the "frozen" process group itself:

 

wasabipeas_0-1663857504788.png

 

This is its configuration:

 

wasabipeas_1-1663857867218.png

And its associated controller services:

 

wasabipeas_2-1663858157694.png

 

I still think that the fastest and safest solution is to delete and re-deploy. 

 

Is it possible?

 

S

Super Mentor

@wasabipeas 
I can think of no way to force delete if it is blocking on a revision mismatch between nodes. Nothing here has anything to do with version control. 

Is it always the same node reported in the pop-up message that fails to process the request?
If so, have you verified that the libs and version running on that one node match the rest of the cluster?

If you go to the cluster UI and select the "VERSIONS" tab, do they all reflect the same version?
You could manually disconnect the one node that it keeps complaining about from the "NODES" tab.
After it is disconnected, you could delete it from the cluster (deleting the node does nothing to the flows or data on that node; it will require a restart of that one node to get it to rejoin the cluster).
Once the node is removed from your cluster (temporarily), your cluster should now reflect 10/10 connected nodes in the status bar of the canvas UI.

Check to see if you are still having revision issues with the process group after reloading the page.
If all looks good, you could access the filesystem of the currently disconnected and deleted node, stop the NiFi service on that node, and delete/rename the flow.xml.gz and flow.json.gz files. Then start this node again. On startup, NiFi will inherit the flow from the cluster and, in doing so, get the cluster flow's current revision for the problematic process group.
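As a minimal sketch of the rename step, assuming NiFi is already stopped on that node and that both flow files live in a single configuration directory (by default the `conf` directory, but check `nifi.flow.configuration.file` in nifi.properties for your setup), something like the following keeps a backup instead of deleting. The helper name `set_aside_flow_files` and the `.bak-` suffix are invented for this example.

```python
import time
from pathlib import Path

# File names taken from the step above; any that are not present are skipped.
FLOW_FILES = ("flow.xml.gz", "flow.json.gz")


def set_aside_flow_files(conf_dir, suffix=None):
    """Rename the flow files in conf_dir so the node inherits the cluster flow on startup."""
    suffix = suffix or time.strftime("%Y%m%d-%H%M%S")
    moved = []
    for name in FLOW_FILES:
        src = Path(conf_dir) / name
        if src.exists():
            dst = src.with_name(f"{name}.bak-{suffix}")
            src.rename(dst)  # keep a backup rather than deleting outright
            moved.append(dst)
    return moved
```

Renaming rather than deleting means the original flow can be restored by hand if the node fails to rejoin the cluster.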

If the problem persists, restart the node that was deleted so that it rejoins the cluster.
Then disconnect the currently elected cluster coordinator. A new cluster coordinator will then be elected by ZooKeeper. Check to see if the issue with the process group is resolved. Reload your browser to force a page refresh.
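If you prefer to identify the current coordinator from a script rather than the cluster UI, a rough sketch is below. It assumes you have already fetched the cluster summary JSON (for example from the `/nifi-api/controller/cluster` REST endpoint) and that each node entry carries a `roles` list containing the string "Cluster Coordinator"; treat that payload shape as an assumption and verify it against your NiFi version. `find_coordinator` is a hypothetical helper name.

```python
import json


def find_coordinator(cluster_json):
    """Return the node entry whose roles include 'Cluster Coordinator', or None."""
    # Assumed payload shape: {"cluster": {"nodes": [{"address": ..., "roles": [...]}, ...]}}
    nodes = json.loads(cluster_json).get("cluster", {}).get("nodes", [])
    for node in nodes:
        if "Cluster Coordinator" in node.get("roles", []):
            return node
    return None
```

The function only parses a payload; fetching it and authenticating against the cluster are left out because they depend on how your cluster is secured.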

If the issue is resolved, rejoin the node to the cluster via the cluster UI to see if the issue returns. If so, we at least know which node is problematic. You can, of course, disconnect it, delete it, rename flow.xml.gz and flow.json.gz, and then restart the node, just as before, so that the flow is pulled from the cluster on startup. If the issue still persists, there is something unique about this node. Is disk space OK? Are there any exceptions in the logs? While the node may report the same NiFi version, something may differ in the contents of its lib folder(s) (get a checksum and compare against the other nodes).

Hope this helps without needing to restart entire cluster,

Matt

Super Guru

Is your dev cluster running the exact same version of NiFi as production, including the NiFi lib folder?


Explorer

Hi, yes same version and libs.