What is the best way to iterate over a list of URLs so that each one triggers the InvokeHTTP processor?

Guru

I would like to configure a list of URLs so that my NiFi InvokeHTTP processor iterates over them sequentially on a schedule (e.g. once per day). Instead of hard-coding 'n' InvokeHTTP processors, each with its own URL, I would like a local configuration file (or NiFi artifact) to list multiple URLs, and to iterate over these so that each URL in turn triggers one generic InvokeHTTP processor instance. What is the best way to do this?

1 ACCEPTED SOLUTION

Rising Star

Your best strategy for this case would be to leverage Expression Language [1] in the Remote URL field of your InvokeHTTP. You could then use a GetFile to read your set of URLs from a file on the filesystem, or even an InvokeHTTP against another location, to introduce your list of URLs, with a scheduling strategy of Cron to pull in that list as desired.

This list could then be split into separate events using a SplitText. From here, we can promote the contents of each of these splits to an attribute (perhaps target.url) using ExtractText. This would then be passed to your InvokeHTTP which would make use of the target.url attribute by specifying ${target.url} in the "Remote URL" field mentioned above.

[1] https://nifi.apache.org/docs/nifi-docs/html/expression-language-guide.html

3 REPLIES

Master Guru

You could use GetFile -> SplitText -> ExtractText -> InvokeHTTP:

  1. GetFile picks up the configuration file; set "Keep Source File" to true and schedule it to run once a day
  2. SplitText splits the file into multiple flow files, each containing a single line/URL
  3. ExtractText puts the contents of each flow file into an attribute (called "my.url", for example)
  4. InvokeHTTP is configured with an Expression Language construct for the URL property (such as "${my.url}")
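
To make the data flow in the four steps above concrete, here is a rough plain-Python simulation of it. This is not NiFi code and does not use any NiFi API; the function names (split_text, extract_text, evaluate, run_flow) and the example URLs are made up for illustration, and evaluate only handles a bare ${name} reference, which is a tiny subset of what Expression Language supports.

```python
import re
from typing import Callable, Dict, List


def split_text(content: str) -> List[str]:
    """Stand-in for SplitText: one 'flow file' per non-empty line."""
    return [line.strip() for line in content.splitlines() if line.strip()]


def extract_text(content: str, attribute: str = "my.url") -> Dict[str, str]:
    """Stand-in for ExtractText: copy the matched content into an attribute."""
    match = re.match(r"(.+)", content)
    return {attribute: match.group(1)} if match else {}


def evaluate(expression: str, attributes: Dict[str, str]) -> str:
    """Resolve simple ${name} references only (assumption: no EL functions)."""
    return re.sub(
        r"\$\{([^}]+)\}",
        lambda m: attributes.get(m.group(1), ""),
        expression,
    )


def run_flow(
    file_content: str,
    invoke: Callable[[str], None],
    remote_url: str = "${my.url}",
) -> None:
    """Stand-in for the whole flow: each line becomes one invocation."""
    for line in split_text(file_content):
        attributes = extract_text(line)
        invoke(evaluate(remote_url, attributes))


# Hypothetical contents of the configuration file read by GetFile.
urls_file = "https://example.com/a\nhttps://example.com/b\n"

called: List[str] = []
run_flow(urls_file, called.append)  # called.append stands in for InvokeHTTP
print(called)  # ['https://example.com/a', 'https://example.com/b']
```

The point of the sketch is that the single "InvokeHTTP" step never knows the URLs in advance: it resolves ${my.url} against whatever attribute each incoming flow file carries, so adding a URL means adding a line to the file rather than adding a processor.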

Explorer

Is this (now) considered a NiFi "anti-pattern"? Do you have any idea how to do this using NiFi Record serialization services? I'm under the impression that creating thousands of content files is not the best practice by today's standards, but I'm not sure how to use InvokeHTTP on a full set of records without splitting it into many flowfiles. Any ideas?