Data is becoming stuck after Input Port in Nifi

Expert Contributor

We currently have 3 NiFi nodes set up in a cluster. MiNiFi runs on a sending server and sends to the Input Port called LoggerMiniFi2, shown in the screenshot.

The data always gets stuck right after the Input Port: it queues up and then just sits there, and the queue never empties. We cannot simply drop the data because it is important, so setting an expiration time doesn't make sense. The Status History in the screenshot below shows that the data has sat there with no bytes out for almost 15 minutes.

Can anyone provide troubleshooting tips, or suggest a cause? Note that I have also tried removing the SplitText processor and connecting the Input Port directly to ExtractText, and the data still stalled right after the Input Port. I'm at a loss as to what could be causing this.

I have also included the SplitText configuration below for reference.

16556-screen-shot-2017-06-22-at-110921-am.png

16557-screen-shot-2017-06-22-at-110935-am.png

16558-screen-shot-2017-06-22-at-110943-am.png

16559-screen-shot-2017-06-22-at-111534-am.png

1 ACCEPTED SOLUTION

Master Guru

Your SplitText processor is scheduled to run on the primary node only, which doesn't seem right. MiNiFi would send data to all nodes.

Most likely the flow files that are sitting there are not on the primary node, which you can determine by doing a List Queue on that connection and looking at the host column on the right.
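
If the queue is large, the same check can be scripted. The snippet below is only a sketch: it assumes the queue listing has already been loaded into a Python dict whose entries carry a `clusterNodeAddress` field, which is how I understand the NiFi REST API's queue listing response to be shaped. Treat the field names as assumptions and verify them against your NiFi version. The sample data and host names are made up.

```python
from collections import Counter


def count_by_host(listing_response):
    """Count queued flow files per cluster node address."""
    # Field names below are assumptions about the listing response shape.
    summaries = listing_response.get("listingRequest", {}).get("flowFileSummaries", [])
    return Counter(s.get("clusterNodeAddress", "unknown") for s in summaries)


# Hypothetical sample; file names and host names are invented for illustration.
sample = {
    "listingRequest": {
        "flowFileSummaries": [
            {"filename": "a.log", "clusterNodeAddress": "nifi-2:8080"},
            {"filename": "b.log", "clusterNodeAddress": "nifi-3:8080"},
            {"filename": "c.log", "clusterNodeAddress": "nifi-2:8080"},
        ]
    }
}

for host, count in sorted(count_by_host(sample).items()):
    print(host, count)
```

If none of the hosts that show up is the current primary node, that would be consistent with the scheduling explanation above.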

Expert Contributor

But if I set it to run on all nodes, doesn't that duplicate the data output? Or am I misunderstanding?

Master Guru

No. You have a 3-node cluster; let's say node #1 is the primary node. MiNiFi is sending data to all nodes, so the data is already divided across them, and each flow file exists on only one node. Because the processor is scheduled to run only on node #1, the data on nodes #2 and #3 will just sit there and never get processed.
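
A toy model may make this more concrete. It is not NiFi code, just a sketch under assumptions: flow files are spread round-robin across the nodes (the real distribution depends on site-to-site load balancing), and `simulate`, the node names, and the flow file names are all hypothetical.

```python
def simulate(nodes, primary, schedule, flowfiles):
    """Return (processed, stuck) for a processor scheduled as 'primary' or 'all'."""
    queues = {n: [] for n in nodes}
    for i, ff in enumerate(flowfiles):
        # Each flow file lands on exactly one node (round-robin is an assumption).
        queues[nodes[i % len(nodes)]].append(ff)

    processed, stuck = [], []
    for node, queue in queues.items():
        # The processor only drains the local queue on nodes where it is scheduled.
        runs_here = schedule == "all" or node == primary
        (processed if runs_here else stuck).extend(queue)
    return processed, stuck


nodes = ["node1", "node2", "node3"]
flowfiles = [f"ff{i}" for i in range(6)]

for schedule in ("primary", "all"):
    processed, stuck = simulate(nodes, "node1", schedule, flowfiles)
    print(schedule, "processed:", len(processed), "stuck:", len(stuck))
```

With "primary", only the flow files that happened to land on node1 are processed and the rest stay queued. With "all", every flow file is processed exactly once, because each node works only on its own local queue.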

Expert Contributor

Awesome, thanks again!

Expert Contributor

That was it! I didn't understand the difference between setting it to Primary Node and setting it to All Nodes. I thought running on all nodes would duplicate all the output data.

Master Guru

Generally you want Primary Node only for a source processor like ListHDFS, where the listing should be performed just once.
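
The reason is that a source processor reads from something outside the cluster that every node can see, so each node that runs it produces its own copy of the results. A minimal sketch of that idea, again as a toy model and not NiFi code (`run_listing`, the node names, and the file names are hypothetical):

```python
def run_listing(nodes, primary, schedule, shared_files):
    """Return every listing entry emitted across the cluster."""
    emitted = []
    for node in nodes:
        if schedule == "all" or node == primary:
            # Every scheduled node lists the same shared location.
            emitted.extend((node, name) for name in shared_files)
    return emitted


nodes = ["node1", "node2", "node3"]
shared_files = ["part-0001", "part-0002"]

print(len(run_listing(nodes, "node1", "primary", shared_files)))
print(len(run_listing(nodes, "node1", "all", shared_files)))
```

In this model, running the listing on all nodes emits each file once per node, while running it on the primary node emits each file once. That is the duplication case, and it applies to sources, not to processors that consume data already split across the nodes' queues.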