ListenUDP Fault tolerance
Labels:
- Apache NiFi
Created 12-25-2023 02:26 AM
Hi all, I have built an Apache NiFi cluster consisting of 5 nodes and a separate ZooKeeper server. I then logged in to the web UI and added a ListenUDP processor. This processor opens a port (for example 50000) on all nodes and can receive data on any of them. For example, if I send data to host1:50000, it should distribute that data across the other nodes (host2, host3, host4, host5). All of this works, but I have two questions.
1) Does host1 also participate in processing the logs further down the flow, or does it only receive and distribute them?
2) The NiFi cluster is described as fault tolerant, but what happens if host1 becomes unavailable? It will no longer be able to receive data and distribute it to the other nodes. Is there a solution for this, so that when host1 goes down, the data can be sent to host2 instead?
Created 12-27-2023 12:14 PM
- How FlowFiles are distributed from your ListenUDP processor to the next processor in the flow is defined by the load-balancing strategy configured on the connection between them.
- Leveraging something like HAProxy, Nginx, or any other form of load balancer in front of your NiFi cluster would be a way to ensure your data is forwarded to whichever nodes are still reachable as long as the cluster is up (see the sketch below).
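Since the question uses host1..host5 and UDP port 50000, here is a minimal sketch of what such an external load balancer could look like with the Nginx stream module. This is only an illustration under those assumptions: the upstream name nifi_listenudp is made up for the example, the hostnames and port come from the question, and your Nginx must be built with the stream module.

```
# Minimal sketch, assuming host1..host5 run ListenUDP on UDP port 50000
# and Nginx was built with the stream module. This goes at the top level
# of nginx.conf, next to the http {} block.
stream {
    upstream nifi_listenudp {
        server host1:50000;
        server host2:50000;
        server host3:50000;
        server host4:50000;
        server host5:50000;
    }

    server {
        listen 50000 udp;            # the single UDP endpoint your senders target
        proxy_pass nifi_listenudp;   # datagrams are spread across the reachable nodes
        proxy_responses 0;           # ListenUDP sends no reply, so don't wait for one
    }
}
```

With something like this in place, senders point at the load balancer's address instead of host1 directly, so losing any single NiFi node no longer takes down the ingest endpoint.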
Created 01-02-2024 06:33 AM
@benimaru
It is important to understand that NiFi does not replicate active FlowFiles (objects queued in connections between NiFi processor components) across multiple nodes. So in a five-node NiFi cluster where you are load balancing FlowFiles across all nodes, each node holds a unique subset of the full data received. Thus, if node 1 goes down, the FlowFiles on node 1 will not be processed until node 1 is back up.
100% agree with @joseomjr that placing an external load balancer in front of the ListenUDP endpoint is the correct solution to ensure high availability of that endpoint across all your NiFi nodes.
If any of the suggestions/solutions provided helped you with your issue, please take a moment to log in and click "Accept as Solution" on one or more of them.
Thank you,
Matt
Created 12-26-2023 10:51 AM
@benimaru Welcome to the Cloudera Community!
To help you get the best possible solution, I have tagged our NiFi experts @SAMSAL @joseomjr who may be able to assist you further.
Please keep us updated on your post, and we hope you find a satisfactory solution to your query.
Regards,
Diana Torres, Community Moderator
Was your question answered? Make sure to mark the answer as the accepted solution.
If you find a reply useful, say thanks by clicking on the thumbs up button.
