
NiFi 1.x multi-datacenter load allocation

New Contributor

Imagine that I have a NiFi 1.x cluster (masterless) spanning multiple datacenters with processors pulling data over HTTP from external servers. If I were concerned that one particular datacenter were unable to reach the data source (think firewall misconfiguration or datacenter being blocked by the data source), how would I go about ensuring that the processor was migrated to a location where it would be able to successfully connect to the data source?

scenario illustration:

DC_A |----------| <YES, reaches data source>
     |          |
DC_B | cluster  | <NO, can't reach data source>
     |          |
DC_C |----------| <YES, reaches data source>

If the GetHTTP processor were running on DC_B, how could I 1) detect that I'm getting failures from the DC_B instance (easy), and 2) migrate the processor to run in another datacenter?

1 ACCEPTED SOLUTION

Super Mentor

@David Arllen

You have an interesting scenario here.

NiFi is designed with the expectation that when you build a dataflow, any node in the cluster is able to run it. This ensures true failover in the event of a NiFi node failure. In your case, I understand you are looking for failover at the dataflow level.

So GetHTTP is configured to run "on primary node" only. Let's say DC_B is your current primary node and everything is working fine. At some point GetHTTP starts failing to connect to the source. What you want is for NiFi to detect this and switch the primary node designation to another node in the cluster, in the hope that the GetHTTP processor will succeed there.

Two things immediately come to mind:

1. A NiFi instance may have a number of processors across different dataflows all using the "on primary node" execution setting. If NiFi were to switch the primary node because just one of those processors was failing, it would switch for all of them.

2. What if the source were truly down and none of your nodes could connect to it? NiFi would keep switching from node to node, hoping eventually to hit a node that can connect again. This would affect the other primary-node flows that are working.

Or am I off, and you are not talking about a NiFi cluster that spans multiple data centers, but rather three data centers each running a standalone NiFi instance? If that is the case, you may be able to have a process monitor the NiFi app log for GetHTTP failures and use curl against the REST API to stop the GetHTTP processor on DC_B and start it on DC_C. You do, however, lose centralized management of your dataflow this way.
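A minimal sketch of that watchdog idea, assuming standalone instances and a recent NiFi 1.x (the `/processors/{id}/run-status` REST endpoint; on very early 1.x releases you would instead PUT the full processor entity). The hostnames, ports, processor IDs, and the log patterns matched are all assumptions you would replace with your own:

```python
# Watchdog sketch: scan nifi-app.log for GetHTTP connection failures and,
# on detection, flip the processor off on one instance and on at another
# via the NiFi REST API. All endpoints/IDs below are hypothetical.
import json
import re
import urllib.request


def gethttp_failing(log_lines, threshold=3):
    """Return True if the log shows >= threshold GetHTTP connection failures.

    The exception names matched here are typical of HTTP connect problems;
    adjust the pattern to whatever your nifi-app.log actually reports.
    """
    pattern = re.compile(
        r"GetHTTP.*(ConnectException|UnknownHostException|connect timed out)"
    )
    return sum(1 for line in log_lines if pattern.search(line)) >= threshold


def set_run_status(api_base, processor_id, state):
    """Stop or start a processor ("STOPPED" / "RUNNING") via the REST API.

    Fetches the processor's current revision first, since NiFi requires the
    revision in the run-status request (optimistic locking).
    """
    with urllib.request.urlopen(f"{api_base}/processors/{processor_id}") as resp:
        revision = json.load(resp)["revision"]
    body = json.dumps({"revision": revision, "state": state}).encode()
    req = urllib.request.Request(
        f"{api_base}/processors/{processor_id}/run-status",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    urllib.request.urlopen(req).close()


if __name__ == "__main__":
    # Demonstrate detection on fabricated log lines (no NiFi needed).
    sample = [
        "2017-06-01 12:00:00,000 ERROR o.a.nifi.processors.standard.GetHTTP "
        "java.net.ConnectException: Connection refused"
    ] * 3
    print(gethttp_failing(sample))  # True
    # On detection, against real instances one would then call:
    # set_run_status("http://nifi-dc-b:8080/nifi-api", "<DC_B processor id>", "STOPPED")
    # set_run_status("http://nifi-dc-c:8080/nifi-api", "<DC_C processor id>", "RUNNING")
```

Note this only shifts the problem, not the cost Matt describes: the watchdog itself becomes the thing you must keep running and secured (including REST authentication on a secured cluster), outside NiFi's own management.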

Bottom line is that this feature does not exist in NiFi currently.

Matt



New Contributor

Thanks for the quick response. You have confirmed my read of the documentation.