I am sure this limit was originally put in place for performance and NiFi JVM heap usage reasons. The first 100 returned should be the oldest 100 in the queue. Keep in mind that the count shown on a connection includes both the FlowFiles pending processing by the downstream processor and those currently allocated to a downstream component's process. The listing returns only the pending FlowFiles, not those already owned by the downstream component.

What is the use case for needing to list more? Ideally, what is found in a queue should be changing rapidly, so the expectation is that each listing request would return something different. Listing a queue does not stop NiFi processing. The intent is not for NiFi to ever hold FlowFiles in any connection, so using the API to poll a connection for FlowFile listings seems odd to me. What is returned by that listing could be inaccurate milliseconds later.
Also, be careful with your API requests. When a listing is performed through the browser, three different requests are made:
1. First, a listing request is made and replicated to all nodes to get the result sets.
2. The response to the step 1 request gives the ID of the generated listing request, which is held in heap memory. That ID is used to fetch the results for that specific listing.
3. A DELETE request is made to remove the listing with that ID from NiFi.
*** When using the API, if steps 1 and 2 are all that are executed, the listing request(s) will stay in heap memory.
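To make the three steps concrete, here is a rough Python sketch of how a client could follow the same create, fetch, delete sequence so that nothing is left behind in heap. Treat it as an illustration under assumptions rather than a tested client: the endpoint paths (`/flowfile-queues/{id}/listing-requests`) and the JSON field names (`listingRequest`, `id`, `finished`, `flowFileSummaries`) are what I would expect from the NiFi REST API, so verify them against the documentation for your version. The helper names `http_json` and `list_queue`, the bearer-token handling, and the polling values are all made up for this example.

```python
import json
import time
import urllib.request


def http_json(method, url, token=None):
    """Send one request with the standard library and decode the JSON reply.

    Hypothetical helper: authentication is reduced to an optional bearer token.
    """
    req = urllib.request.Request(url, method=method)
    if token:
        req.add_header("Authorization", f"Bearer {token}")
    with urllib.request.urlopen(req) as resp:
        body = resp.read()
    return json.loads(body) if body else {}


def list_queue(base_url, connection_id, send=http_json,
               poll_seconds=0.5, max_polls=20):
    """Create a listing request, read its results, and always delete it."""
    # Assumed endpoint layout; check the REST API docs for your NiFi version.
    root = f"{base_url}/flowfile-queues/{connection_id}/listing-requests"

    # Step 1: create the listing request. NiFi keeps it in heap until deleted.
    listing = send("POST", root)["listingRequest"]
    request_url = f"{root}/{listing['id']}"
    try:
        # Step 2: fetch the results by ID until the request reports finished
        # (or until max_polls is reached, in which case results may be partial).
        for _ in range(max_polls):
            if listing.get("finished"):
                break
            time.sleep(poll_seconds)
            listing = send("GET", request_url)["listingRequest"]
        return listing.get("flowFileSummaries", [])
    finally:
        # Step 3: delete the listing request, even if step 2 raised an error.
        send("DELETE", request_url)
```

The `try`/`finally` is the important part: the DELETE runs whether or not fetching the results succeeds, so a failed or interrupted script does not leave the listing request sitting in heap memory.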