Support Questions

kiemnguyenxuan1 · ‎08-31-2017

I have a NiFi cluster with 3 nodes: node-1, node-2, node-3. When I ran a job on the cluster, some errors occurred and node-2 disconnected from the cluster. I then tried to use the admin UI on node-1 or node-3 to stop this job, but I cannot stop it.

It reports:
Cluster is unable to service request to change flow: Node node-2:8092 is currently disconnected

MattWho · ‎08-31-2017

@Kiem Nguyen

In a NiFi cluster, NiFi needs to keep the flow consistent across all nodes. You can't have each node in a NiFi cluster running a different version/state of the flow.xml.gz file. In a cluster, NiFi replicates each request (such as stopping one or more processors) to all nodes. Since one node is not connected, that replication cannot occur. So to protect the integrity of the cluster, the NiFi canvas is essentially read-only while a node is disconnected.

Your two options are:

1. Reconnect the disconnected node and then stop your dataflow(s).

2. Drop the disconnected node from your cluster via the "Cluster" UI found in the hamburger menu in the upper right corner of the UI. This makes your cluster a 2 of 2 cluster and returns the UI to full functionality. Once the problem on the dropped node is fixed, you will need to restart that node to get it to try to join the cluster again.
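
For anyone who would rather script these two options than click through the UI, here is a rough Python sketch. It assumes the NiFi REST API exposes the cluster at `/nifi-api/controller/cluster` and individual nodes at `/nifi-api/controller/cluster/nodes/{id}`, which is my understanding of the REST API docs; please verify against the docs for your NiFi version. The base URL, the sample node IDs, and the helper names are all hypothetical, and a secured cluster would also need authentication that is not shown here.

```python
import json
import urllib.request

# Assumption: base URL of a node that is still connected (hypothetical host/port).
BASE_URL = "http://node-1:8092/nifi-api"


def disconnected_nodes(cluster_entity):
    """Pick DISCONNECTED nodes out of a GET /controller/cluster response body."""
    nodes = cluster_entity.get("cluster", {}).get("nodes", [])
    return [n for n in nodes if n.get("status") == "DISCONNECTED"]


def reconnect_request(base_url, node_id):
    """Option 1: build a request asking the cluster to reconnect the node."""
    body = {"node": {"nodeId": node_id, "status": "CONNECTING"}}
    url = f"{base_url}/controller/cluster/nodes/{node_id}"
    return "PUT", url, json.dumps(body)


def drop_request(base_url, node_id):
    """Option 2: build a request that removes the node from the cluster."""
    url = f"{base_url}/controller/cluster/nodes/{node_id}"
    return "DELETE", url, None


def send(method, url, body=None):
    """Send a prepared request (no authentication; unsecured clusters only)."""
    data = body.encode("utf-8") if body is not None else None
    req = urllib.request.Request(
        url, data=data, method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    # Made-up payload shaped like the cluster entity, for a dry run only.
    sample = {"cluster": {"nodes": [
        {"nodeId": "id-1", "address": "node-1", "status": "CONNECTED"},
        {"nodeId": "id-2", "address": "node-2", "status": "DISCONNECTED"},
        {"nodeId": "id-3", "address": "node-3", "status": "CONNECTED"},
    ]}}
    for node in disconnected_nodes(sample):
        # Print what would be sent; call send(...) to actually issue it.
        print(reconnect_request(BASE_URL, node["nodeId"]))
        print(drop_request(BASE_URL, node["nodeId"]))
```

The request builders are kept separate from `send` so you can inspect exactly what would be issued before touching a live cluster. In practice you would fetch the real cluster entity with `send("GET", BASE_URL + "/controller/cluster")` and parse it with `json.loads` instead of using the sample payload.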

Thanks,

Matt

kiemnguyenxuan1 · ‎09-01-2017

@Matt Clarke

Thanks for your reply. I followed the second option, but I had to remove the data content on the disconnected node before restarting it.

I also found that the node disconnected because a queue became overloaded while the job was executing.
I am not sure whether we can configure the queue size so that it can hold more data. How can we do this?

Please help if you have a solution for this problem (overloaded queue).

Thanks,

Kiem

MattWho · ‎09-01-2017

@Kiem Nguyen

I highly recommend starting a new question in Hortonworks Community Connection for this. Diagnosing what caused your node to disconnect and how to resolve it is a different topic from how to stop a processor with a disconnected node.

It would also be helpful to explain what you mean by "overloaded queue" and what makes you feel the size of your queue triggered your node to disconnect. What error did you see in the nifi-app.log on the node that disconnected?

Thanks,

Matt
