Created 12-25-2023 02:26 AM
Hi all, I have built an Apache NiFi cluster consisting of 5 nodes and a separate ZooKeeper server. After that I log in to the web UI and use the ListenUDP processor. This processor opens a port, for example 50000, on all nodes, and it can receive information on any node. For example, if I send information to host1:50000, it should be distributed across all nodes (host2, host3, host4, host5). It does all this and everything works, but I have two questions.
1) Does host1 participate in sending logs further down the processors or does it just receive and distribute?
2) The NiFi cluster is called fault tolerant, but what if host1 becomes unavailable? In that case it can no longer receive information and distribute it to the other nodes. Is there any solution for this, so that when host1 goes down the information can be sent to host2 instead?
Created 12-27-2023 12:14 PM
Created 01-02-2024 06:33 AM
@benimaru
It is important to understand that NiFi does not replicate active FlowFiles (objects queued in connections between NiFi processor components) across multiple nodes. So in a five-node NiFi cluster where you are load balancing FlowFiles across all nodes, each node holds a unique subset of the full data received. Thus, if node 1 goes down, the FlowFiles on node 1 will not be processed until node 1 is back up.
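To make the point above concrete, here is a toy Python model of it. This is not NiFi code: the node names, the round-robin assignment, and the helper names `distribute` and `processable` are all illustrative assumptions. It only shows that when each FlowFile lives on exactly one node, losing a node makes that node's subset unavailable until it returns.

```python
from collections import defaultdict


def distribute(flowfiles, nodes):
    """Assign each FlowFile to exactly one node, round-robin (toy model)."""
    queues = defaultdict(list)
    for i, ff in enumerate(flowfiles):
        queues[nodes[i % len(nodes)]].append(ff)
    return dict(queues)


def processable(queues, down):
    """FlowFiles that can still be processed while the nodes in `down` are offline."""
    return sorted(ff for node, q in queues.items() if node not in down for ff in q)


nodes = ["host1", "host2", "host3", "host4", "host5"]  # hypothetical node names
queues = distribute(list(range(10)), nodes)

stuck = queues["host1"]                          # held only by host1, no replica elsewhere
available = processable(queues, down={"host1"})  # what the other four nodes can still work on
```

In this model the FlowFiles in `stuck` appear in no other queue, so they simply wait until host1 comes back.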
100% agree with @joseomjr that placing an external load balancer in front of the ListenUDP endpoint is the correct solution to ensure high availability of that endpoint across all your NiFi nodes.
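As a rough illustration of what such a load balancer does for UDP traffic, the sketch below forwards each datagram to the next healthy node in round-robin order. It is a minimal sketch under stated assumptions, not a recommendation to run this in production: the class `UdpBalancer`, its methods, and the host names are hypothetical, and the health check that would call `mark_down` / `mark_up` is not shown. A dedicated load balancer product would normally fill this role.

```python
import itertools
import socket


class UdpBalancer:
    """Toy round-robin UDP forwarder; a stand-in for a real load balancer."""

    def __init__(self, backends):
        self.backends = list(backends)      # (host, port) tuples, e.g. the NiFi nodes
        self.healthy = set(self.backends)   # maintained by a health check (not shown)
        self._cycle = itertools.cycle(self.backends)

    def mark_down(self, backend):
        self.healthy.discard(backend)

    def mark_up(self, backend):
        if backend in self.backends:
            self.healthy.add(backend)

    def pick(self):
        """Return the next healthy backend, skipping nodes marked down."""
        for _ in range(len(self.backends)):
            candidate = next(self._cycle)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy backends")

    def serve(self, listen_addr):
        """Forward each received datagram to one healthy backend (blocks forever)."""
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            sock.bind(listen_addr)
            while True:
                data, _ = sock.recvfrom(65535)
                sock.sendto(data, self.pick())
```

Senders would then target the balancer's address instead of host1:50000, for example via `UdpBalancer([("host1", 50000), ("host2", 50000)]).serve(("0.0.0.0", 50000))`. Note that this simple forwarder replaces the original sender's source address with its own, and because UDP is connectionless, node health has to come from a separate check rather than from the traffic itself.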
If any of the suggestions or solutions provided helped you with your issue, please take a moment to log in and click "Accept as Solution" on one or more of them.
Thank you,
Matt
Created 12-26-2023 10:51 AM
@benimaru Welcome to the Cloudera Community!
To help you get the best possible solution, I have tagged our NiFi experts @SAMSAL @joseomjr who may be able to assist you further.
Please keep us updated on your post, and we hope you find a satisfactory solution to your query.
Regards,
Diana Torres