Created 04-25-2022 05:40 AM
Hi. I haven't found any documentation about the internals of the InvokeHTTP processor in NiFi. My requirement is to throttle requests to an endpoint (which we also control) that will accept, let's say, 5 simultaneous connections on a given address and port, but I need to wait for each of their responses before letting another flowfile be sent, so:
This all boils down to not knowing exactly how this processor works under the hood. Any insight would be much appreciated, thanks in advance!
Created on 04-26-2022 06:24 AM - edited 04-26-2022 06:28 AM
@jonay__reyes I think by default you will see the result you are expecting; however, the expected limit of 5 concurrent connections may be a challenge. Let's address your questions first:
For concurrent tasks and run schedule adjustments, you should always experiment in small increments, changing one setting at a time, evaluating, and repeating until you find the right balance. I suspect that you will not need 5 long-running request/response pairs in parallel, and that even with default settings, your queued flowfiles will execute fast enough to appear "simultaneous".
Created 04-26-2022 07:01 AM
Thanks for your replies @steven-matison!!! great help indeed.
Anyway, a little more detail on the 3rd point please: "Run Schedule sets how long a process will operate before a new instance is necessary".
I use this setting in other processors to set how often each processor is told "hey, start working!". In this InvokeHTTP case, with the specific configuration of "5 concurrent tasks" and "run schedule 0.2 sec", my question is: will it wait 0.2s between one request and the next, on each thread?
Since you just explained that >1 task means "don't even wait", I guess that this setting will make the processor send out 5 requests every 0.2s, so 25 requests per second, without caring whether the remote server has replied or not. Is this the case?
Otherwise, if concurrent tasks were set to 1, would the processor wait for a response, THEN wait 0.2s, and THEN send the next one?
My confusion comes from your "how long a process will operate" versus "how long it will wait after processing the previous flowfile".
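To make my two guesses concrete, here is the arithmetic as a small Python sketch. Neither formula is confirmed NiFi behavior; they are only the two readings I'm asking about. The function names are mine, and the 0.3s response time is a made-up number for illustration.

```python
def rate_fire_and_forget(tasks: int, run_schedule_s: float) -> float:
    """Reading 1 (assumption): every task sends once per run schedule,
    without waiting for the remote server to reply."""
    return tasks / run_schedule_s


def rate_wait_for_response(tasks: int, run_schedule_s: float, response_s: float) -> float:
    """Reading 2 (assumption): each task waits for its response,
    then waits the run schedule, then sends the next request."""
    return tasks / (response_s + run_schedule_s)


# 5 concurrent tasks, run schedule 0.2 sec, responses ignored
print(rate_fire_and_forget(5, 0.2))

# 1 concurrent task, run schedule 0.2 sec, hypothetical 0.3 sec response time
print(rate_wait_for_response(1, 0.2, 0.3))
```

Under the first reading the endpoint would see about 25 requests per second; under the second, the rate depends on how fast the server answers.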
Thanks again!
Created 04-28-2022 05:01 AM
Do not think of the number of concurrent tasks (concurrency) and the run schedule for that processor as relating to request/response timing. For InvokeHTTP specifically, the request/response time could be anything from almost instant to as long as your other end takes to respond. Concurrency is used to get a higher number of unique instances running against that processor, usually to help drain a huge queue of flowfiles (1,000s, 10,000s, 1,000,000s, etc.). Run schedule is how long that one instance stays active (able to process more than 1 flowfile in sequence).
Hope this helps,
Steven
Created 01-12-2023 11:52 PM
Hi, I have the same problem. I want to process 5 flowfiles at a time and send the next one only when one of the 5 gets a response, using InvokeHTTP. Can someone send a config?
Created 01-13-2023 04:12 AM
Even though you could send 5 at a time, you cannot wait for any of them (e.g., sequentially) before allowing the next batch to be sent, at least as far as I know, using only this processor. I'd play with the idea of routing every response of this processor (retry, fail, success?) to a RouteOnAttribute processor that evaluates a flag governing the InvokeHTTP, or better yet, use the Wait/Notify processors as explained in https://community.cloudera.com/t5/Support-Questions/Retrieve-Value-of-Signal-Counter-in-Wait-Notify-...