We have a clustered NiFi, Kafka, and ZooKeeper configuration (shown in the figure), and we need to receive emails over SMTP, using the NiFi ListenSMTP processor to forward the incoming emails to Kafka.
We need to receive the emails without generating duplicates.
Since we need a single entry point to the service but have two NiFi nodes in the cluster, we need to understand how to configure this so that it works as required.
It sounds like this should run as an isolated processor, configured to execute on the Primary Node only instead of on All Nodes.
Then, to take full advantage of both of your NiFi nodes, you'll want to create a Remote Process Group (RPG) pointing back at your own cluster, much as explained in https://community.hortonworks.com/articles/97773/how-to-retrieve-files-from-a-sftp-server-using-nif.....
Good luck and happy Flowfiling!
With recent versions of NiFi (1.8.0 and later) it is no longer necessary to use a Remote Process Group (RPG) to redistribute FlowFiles within the same NiFi cluster. A load-balancing capability has been added to the connections between processors. It allows you to redistribute FlowFiles when they land on a connection, without needing to send them through an RPG. It also provides multiple load-balancing strategies, such as round robin, partition by attribute, and single node.
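To make the strategies a bit more concrete, here is a rough Python simulation of how round robin and partition by attribute spread FlowFiles across nodes. It is only an illustration: the node names, the `sender` attribute, and the CRC32 hashing are my assumptions and not NiFi's actual implementation. In NiFi itself you just pick a Load Balance Strategy in the connection's settings.

```python
import zlib
from collections import defaultdict


def round_robin(flowfiles, nodes):
    """Assign FlowFiles to nodes in turn: 1st to node 0, 2nd to node 1, ..."""
    assignment = defaultdict(list)
    for index, flowfile in enumerate(flowfiles):
        assignment[nodes[index % len(nodes)]].append(flowfile)
    return dict(assignment)


def partition_by_attribute(flowfiles, nodes, attribute):
    """Send all FlowFiles sharing the same attribute value to the same node.

    CRC32 is used here only because it is deterministic across runs;
    it is an assumption for this sketch, not what NiFi uses internally.
    """
    assignment = defaultdict(list)
    for flowfile in flowfiles:
        key = str(flowfile.get(attribute, "")).encode("utf-8")
        node = nodes[zlib.crc32(key) % len(nodes)]
        assignment[node].append(flowfile)
    return dict(assignment)


if __name__ == "__main__":
    # Hypothetical node names and FlowFile attributes
    nodes = ["nifi-node1", "nifi-node2"]
    flowfiles = [
        {"id": 1, "sender": "alice@example.com"},
        {"id": 2, "sender": "bob@example.com"},
        {"id": 3, "sender": "alice@example.com"},
        {"id": 4, "sender": "bob@example.com"},
    ]
    print(round_robin(flowfiles, nodes))
    print(partition_by_attribute(flowfiles, nodes, "sender"))
```

Round robin gives an even spread, while partitioning by an attribute keeps related FlowFiles (here, emails from the same sender) together on one node.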
It is not necessary to run a Listen-based processor on the Primary Node only. A Listen-based processor binds to a port on each node where it runs and listens for incoming connections.
Your SMTP server would be configured to send emails to a specific hostname for your ListenSMTP processor. Since this would result in all emails going to just one of your NiFi nodes, it may not be desirable. Typically, users set up an external load balancer (LB) in front of their NiFi cluster to distribute emails across the ListenSMTP processors on all NiFi nodes.
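If you want to try this out before a real LB is in place, a small Python client can rotate over the nodes itself. This is just a sketch under assumptions: the hostnames and port 2525 are hypothetical and need to match whatever your ListenSMTP processors actually listen on, and client-side rotation is only a stand-in for an external LB.

```python
import itertools
import smtplib
from email.message import EmailMessage

# Hypothetical hostnames and port; replace with your own NiFi nodes
NIFI_NODES = [
    ("nifi-node1.example.com", 2525),
    ("nifi-node2.example.com", 2525),
]


def round_robin(nodes):
    """Return an endless iterator that cycles over the given nodes."""
    return itertools.cycle(nodes)


def build_message(sender, recipient, subject, body):
    """Build a simple plain-text email."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send_via_next_node(picker, msg, timeout=10):
    """Send the message to the next node in the rotation and return its host."""
    host, port = next(picker)
    with smtplib.SMTP(host, port, timeout=timeout) as smtp:
        smtp.send_message(msg)
    return host


if __name__ == "__main__":
    picker = round_robin(NIFI_NODES)
    for i in range(4):
        message = build_message(
            "sender@example.com", "inbox@example.com", f"test {i}", "hello"
        )
        print("delivered to", send_via_next_node(picker, message))
```

Each email is delivered to exactly one node, so spreading the traffic this way does not by itself create duplicates.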