Created 10-04-2017 06:45 AM
I have a NiFi cluster with 3 nodes. Currently I'm using the ListenWebSocket processor with a JettyWebSocketServer listening on port 9001 to receive JSON data from multiple clients.
I researched load balancing for ListenXXX processors in NiFi, and most posts suggest putting an HA proxy between the clients and the NiFi cluster.
I have also worked with ListXXX/FetchXXX, where a Remote Process Group can be used together with the primary node setting on the List/Fetch processors.
So I am wondering whether a Remote Process Group can also be used with ListenXXX processors such as ListenWebSocket to load balance across the cluster.
Clients would send data to one node of the NiFi cluster, and that node would forward the data through a Remote Process Group to balance the load across all nodes.
Would there be any duplicated or lost data?
Created 10-08-2017 02:03 PM
It depends on which part you want to load balance. If it's data reception, you need a load balancer in front of the cluster, since NiFi is only listening. If it's data processing, you can receive data on one node and use Site-to-Site (S2S) with a Remote Process Group (RPG) to distribute the load to the other nodes, so that the data is processed on the whole cluster.
Note that in this case you have no high availability for data reception. Your clients are configured with the address of one node, so if that node goes down they won't be able to get data into NiFi and you can lose data. That's another benefit of having a load balancer.
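To make the reception point concrete, here is a toy Python sketch of what a load balancer adds in front of the listening nodes: it rotates across nodes and skips any node marked as down. This is only an illustration, not NiFi or HAProxy code. The class name `RoundRobinBalancer` and the host names `nifi1` to `nifi3` are made up for the example; only port 9001 comes from the question.

```python
class RoundRobinBalancer:
    """Toy model of a load balancer: rotate across nodes, skip unhealthy ones."""

    def __init__(self, nodes):
        self.nodes = list(nodes)        # hypothetical NiFi node addresses
        self.healthy = set(self.nodes)  # nodes currently accepting connections
        self._next = 0                  # index of the next node to try

    def mark_down(self, node):
        """Record that a node failed its health check."""
        self.healthy.discard(node)

    def mark_up(self, node):
        """Record that a known node is reachable again."""
        if node in self.nodes:
            self.healthy.add(node)

    def pick(self):
        """Return the next healthy node, or fail if none is left."""
        for _ in range(len(self.nodes)):
            node = self.nodes[self._next]
            self._next = (self._next + 1) % len(self.nodes)
            if node in self.healthy:
                return node
        raise RuntimeError("no healthy NiFi node available")


if __name__ == "__main__":
    lb = RoundRobinBalancer(["nifi1:9001", "nifi2:9001", "nifi3:9001"])
    print([lb.pick() for _ in range(3)])  # each node receives one connection
    lb.mark_down("nifi2:9001")            # simulate a node failure
    print([lb.pick() for _ in range(4)])  # traffic continues on nifi1 and nifi3
```

With clients pointed at a single node instead, that node going down is equivalent to `pick()` raising on every call, which is the loss of availability described above.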
I hope this helps.
Created 10-13-2017 09:01 AM
Thanks for the details, @Abdelkrim Hadjidj.
After researching, I reached the same conclusion as you. There are two common ways to load balance:
First, put an HA proxy between the clients and the NiFi cluster.
Alternatively, use one node for reception and then forward the data via S2S and an RPG to distribute it.