Does NiFi's DistributeLoad (or similar) have any capability to distribute to the same cluster nodes by attribute/key?

I need to route specific "streams" of FlowFiles to specific cluster nodes based on a FlowFile attribute value. Is there a way to ensure that all incoming FlowFiles with the same value for a given attribute are distributed to the same cluster node?

1 ACCEPTED SOLUTION

The following assumes you do not explicitly require Site-to-Site to transmit the data. While Site-to-Site has facilities that could support this, it is not an extension point and currently provides no control over how FlowFiles are delivered.

The simplest and most naive approach would be to use RouteOnAttribute to route each FlowFile to a given relationship, and then use that relationship to feed your transmission processor of choice. In this case, a PostHTTP sending to a ListenHTTP would be one approach, and it allows the data to be transmitted formatted as FlowFiles. Depending on the source system (for example, if it is clustered), you might also need to use Expression Language to mark the destination system on each FlowFile and use that mark to dynamically build the POST URL for the associated listener. This is fairly static and simple, but it would cover the use case you described.
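To make the routing logic concrete, here is a minimal Python sketch of the decision the flow above would encode. It is not NiFi configuration or NiFi API code; it only illustrates the static mapping from an attribute value to a destination node and the resulting POST URL. The attribute name `stream.id`, the host names, and the port are hypothetical placeholders; `contentListener` is used as the path on the assumption that the ListenHTTP Base Path is left at its default.

```python
from urllib.parse import urlunsplit

# Hypothetical static mapping: attribute value -> destination cluster node.
# In the flow, each entry would correspond to one RouteOnAttribute relationship.
STREAM_TO_NODE = {
    "orders": "nifi-node-1.example.com",
    "payments": "nifi-node-2.example.com",
    "audit": "nifi-node-3.example.com",
}

LISTEN_PORT = 8081              # hypothetical ListenHTTP port
BASE_PATH = "contentListener"   # assumes the ListenHTTP Base Path default


def destination_url(attributes, key="stream.id"):
    """Return the POST URL for a FlowFile's attributes, or None if unmatched.

    The same attribute value always yields the same URL, so every FlowFile
    in a given "stream" is sent to the same node.
    """
    host = STREAM_TO_NODE.get(attributes.get(key))
    if host is None:
        # Comparable to a FlowFile that matches no routing rule.
        return None
    return urlunsplit(("http", f"{host}:{LISTEN_PORT}", f"/{BASE_PATH}", "", ""))


if __name__ == "__main__":
    print(destination_url({"stream.id": "orders"}))
    print(destination_url({"stream.id": "unknown"}))
```

Because the mapping is a fixed table, adding a new stream or node means editing the table, which mirrors the "fairly static" nature of the RouteOnAttribute approach.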
