Apache NiFi Restart Issue

New Contributor

Hi,

When our NiFi servers were restarted, our processors did not run until we started them again manually.

Could you please confirm whether the flows need to be restarted each time an outage happens on our end?

5 REPLIES

Master Mentor

@vivek12 

 

If your NiFi dataflows are all coming up stopped after a NiFi restart, this indicates that the following property in your nifi.properties file is set to false:

nifi.flowcontroller.autoResumeState=

 

Make sure that this property is set to true so that all components return to their last known state from before NiFi was last shut down.
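
As a rough way to confirm the setting on a node, here is a minimal Python sketch that reads the property out of the text of nifi.properties. The helper names `read_property` and `auto_resume_enabled` are made up for illustration and are not part of NiFi, and the parsing only handles simple `key=value` lines.

```python
def read_property(properties_text, key):
    """Return the value of `key` from simple key=value properties text, or None.

    If the key appears more than once, the last occurrence wins.
    """
    result = None
    for line in properties_text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith(("#", "!")):
            continue
        name, sep, value = line.partition("=")
        if sep and name.strip() == key:
            result = value.strip()
    return result


def auto_resume_enabled(properties_text):
    """True only if nifi.flowcontroller.autoResumeState is explicitly 'true'."""
    value = read_property(properties_text, "nifi.flowcontroller.autoResumeState")
    return value is not None and value.lower() == "true"
```

You would pass in the contents of the nifi.properties file from each node's conf directory (the exact path depends on your install) and check that `auto_resume_enabled` returns True everywhere.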

 

Hope this helps,

Matt

New Contributor

Hi,

Thanks for your reply.

I checked whether this option is enabled, and it is set to true.

However, we are still facing the same issue after a restart.

 

Regards

vivek

 

Master Mentor

@vivek12 

 

The last known state of the components is written to the flow.xml.gz file. Make sure the NiFi service user owns this file and has proper permissions on it.
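
One quick way to check this is a small Python sketch like the one below, run as the NiFi service user on a Unix-like host. The function name `service_user_can_manage` is hypothetical, and the sketch only looks at the file itself, not at its parent directory.

```python
import os


def service_user_can_manage(path):
    """Check that the current user owns `path` and can read and write it.

    Assumption: this is executed as the NiFi service user on a Unix-like system.
    """
    info = os.stat(path)
    # Compare the file's owner with the user running this check.
    owned = info.st_uid == os.getuid()
    # os.access tests read/write permission for the real user id.
    accessible = os.access(path, os.R_OK | os.W_OK)
    return owned and accessible
```

Call it with the path to your flow.xml.gz. A False result means the ownership or permissions are worth a closer look.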

Do you have multiple nodes in your NiFi cluster?  If so, make sure that property is set to true on every node.  It only takes one node with it set to false to cause issues.

What version of Apache NiFi are you running?

Thanks,

Matt

New Contributor
Hi, sorry for the late reply.
 
Yes, we have multiple nodes in the cluster, and we checked that it is set to true on each node.
 
We are using NiFi version 1.9.
 
We are facing this issue mostly with the InvokeHTTP processor.
 
 
Regards 
Bhuvan 

Master Mentor

@vivek12 

 

Your last statement is confusing to me.
Are you saying that only the InvokeHTTP processor is stopped after a NiFi restart?

 

This implies that someone stopped it or that the flow.xml.gz is not getting updated with that processor's last known state.  I'd inspect what is written in your flow.xml.gz on each node to make sure they all show the state as running.
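
To make that inspection less manual, something like the following Python sketch could list the stored state of matching processors. It assumes the NiFi 1.x flow.xml layout, where each `processor` element has `id`, `name`, `class`, and `scheduledState` child elements. Please verify that against your own file, since I am describing the layout from memory. The function name `processor_states` is made up for this example.

```python
import gzip
import xml.etree.ElementTree as ET


def processor_states(flow_path, class_suffix="InvokeHTTP"):
    """Return (name, id, scheduledState) for processors whose class ends with class_suffix.

    Assumes each <processor> element has <id>, <name>, <class> and
    <scheduledState> children (NiFi 1.x flow.xml layout).
    """
    with gzip.open(flow_path, "rb") as handle:
        root = ET.parse(handle).getroot()
    results = []
    for proc in root.iter("processor"):
        if (proc.findtext("class") or "").endswith(class_suffix):
            results.append(
                (proc.findtext("name"), proc.findtext("id"), proc.findtext("scheduledState"))
            )
    return results
```

Run it against the flow.xml.gz from each node and compare the output. Any node reporting a state other than RUNNING for the processor in question is the one to look at.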

The last known state of a processor is not something that is checked as a node joins a cluster.  Nodes joining are checked to make sure their flow.xml matches the elected cluster flow, and if it does, the node can join.  That node will then be told to start those components which are running in the cluster.

When you restart the entire cluster, each node presents its flow.  The first node has flow X and gets 1 vote.  The next node's flow is checked, and if it matches exactly, that flow gets 2 votes.  If the next node's flow is not exactly the same, that flow gets 1 vote.  The flow with the most votes becomes the cluster flow.
If you have an even number of nodes (for example, 4 nodes), 2 nodes' flows may get votes for one flow while the other 2 nodes' flows get votes for a different one.  With a 2 vs. 2 tie, NiFi will end up picking one at random.  My concern here is that some node(s) have the last known state as stopped while another has it as running, so with a complete restart of your cluster you sometimes end up starting the flow with the stopped state on this processor.
The other possibility is that this InvokeHTTP processor is failing validation on some node at startup, resulting in the processor being stopped.
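
To illustrate the voting described above, here is a toy Python model. It is only a sketch of the idea, not NiFi's actual election code. The `node_flows` mapping (node name to a string standing in for that node's flow) and the function names are assumptions made for the example.

```python
import random
from collections import Counter


def tally_votes(node_flows):
    """Count how many nodes present each distinct flow."""
    return Counter(node_flows.values())


def elect_flow(node_flows, rng=random):
    """Pick the flow with the most votes; break ties at random."""
    if not node_flows:
        raise ValueError("at least one node is required")
    votes = tally_votes(node_flows)
    top = max(votes.values())
    # All flows that share the highest vote count.
    leaders = sorted(flow for flow, count in votes.items() if count == top)
    return rng.choice(leaders)
```

With three nodes where two agree, the shared flow always wins. With four nodes split 2 vs. 2, either flow can come out, which is how a flow holding the stopped state can sometimes become the cluster flow.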

Have you tried copying the flow.xml.gz from one node to all the other nodes?

 

Hope this helps,

Matt