Created on 06-22-2017 03:20 PM - edited 08-17-2019 06:46 PM
We currently have a cluster of three NiFi nodes. I have MiNiFi on a sending server, sending to the Input Port in the screenshot called LoggerMiniFi2.
The data always gets stuck right after the Input Port. By stuck I mean it queues up and then just sits there and never empties. We cannot just drop the data; it's important, so setting an expiration time doesn't make sense. You can see in the screenshot below, under Status History, that the data has sat there with no bytes out for almost 15 minutes.
Can anyone provide troubleshooting tips or suggest a possible cause? Note, I have also tried removing the SplitText processor and connecting the Input Port directly to ExtractText, and it still stalled right after the Input Port. I'm at a loss as to what could be causing this.
Note, I have put the SplitText configs down below just for reference as well.
Created 06-22-2017 04:56 PM
Your SplitText processor is scheduled to run on Primary Node only, which doesn't seem right. MiNiFi would send data to all nodes.
Most likely the flow files that are sitting there are not on the primary node, which you can determine by doing a List Queue on that connection and looking at the host column on the right.
Created 06-22-2017 05:01 PM
But if I say "run on all nodes", doesn't that duplicate the data output? Or am I misunderstanding?
Created 06-22-2017 05:04 PM
No. You have a 3-node cluster; let's say node #1 is the primary node. MiNiFi is sending data to all nodes, so the data is already divided across the nodes, but you are only scheduled to process it on node #1, so data on nodes #2 and #3 will just sit there and never get processed.
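To make that concrete, here is a rough Python sketch of the situation, not NiFi code. It assumes incoming flow files are spread round-robin across the nodes (a simplification; the real distribution may be uneven), and the node names and the `simulate` helper are made up for illustration.

```python
from collections import Counter
from itertools import cycle


def simulate(num_flowfiles, nodes, scheduled_nodes):
    """Spread flow files round-robin over nodes (an assumed simplification),
    then 'process' only on the nodes where the processor is scheduled.

    Returns (processed, stuck): two dicts mapping node name -> flow file count.
    """
    queues = Counter()
    for _, node in zip(range(num_flowfiles), cycle(nodes)):
        queues[node] += 1

    # Each flow file lives on exactly one node, so nothing is duplicated.
    processed = {n: c for n, c in queues.items() if n in scheduled_nodes}
    stuck = {n: c for n, c in queues.items() if n not in scheduled_nodes}
    return processed, stuck


nodes = ["node1", "node2", "node3"]  # hypothetical names; node1 plays the primary

# Processor scheduled on the primary node only: data on node2/node3 never drains.
primary_only = simulate(9, nodes, {"node1"})

# Processor scheduled on all nodes: each node handles its own share, once.
all_nodes = simulate(9, nodes, set(nodes))

print("primary only:", primary_only)
print("all nodes:   ", all_nodes)
```

With All Nodes, the total processed equals the total received, because each node only works on the flow files sitting in its own queue.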
Created 06-22-2017 05:26 PM
Awesome, thanks again!
Created 06-22-2017 05:04 PM
That was it! I didn't understand the difference between setting it to Primary Node and All Nodes. I thought All Nodes would duplicate all the output data.
Created 06-22-2017 05:05 PM
Generally you want Primary Node only for a source processor like ListHDFS, where you want to perform the listing just one time.
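As a rough illustration of why a listing-style source is the exception, here is a small Python sketch, not NiFi code. It assumes every node sees the same shared directory and emits one flow file per entry; the `run_listing` helper, file names, and node names are hypothetical.

```python
def run_listing(shared_files, scheduled_nodes):
    """Each scheduled node lists the same shared directory and emits
    one (node, filename) entry per file it sees."""
    emitted = []
    for node in scheduled_nodes:
        for name in shared_files:
            emitted.append((node, name))
    return emitted


files = ["a.log", "b.log"]  # hypothetical shared directory contents

# Listing scheduled on the primary node only: each file is listed once.
primary = run_listing(files, ["node1"])

# Listing scheduled on all nodes: every node lists the same files again.
everyone = run_listing(files, ["node1", "node2", "node3"])

print(len(primary), "entries when run on the primary node")
print(len(everyone), "entries when run on all nodes")
```

The difference from the SplitText case is where the data comes from: a source processor reads the same external listing on every node, while a downstream processor only sees the flow files already queued on its own node.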