How can I use InvokeHTTP to download multiple .zip files from different web pages dynamically in NiFi?

I have multiple web pages that contain the .zip files I need to download and archive. I need to find the names of the files on each web page and download them to a specific directory. I have tried the following flow:

GenerateFlowFile (contains all the URLs I need data from) -> SplitText (one URL per line) -> ExtractText -> InvokeHTTP

This gives me a response in which the file names I need are listed in <a href=**> tags.

How would I extract the names of the files I need from the response, and then download the files?

Regards,

1 REPLY

Re: How can I use InvokeHTTP to download multiple .zip files from different web pages dynamically in NiFi?

@Dakota M I recently came across an article that parses HTML pages for images. Perhaps it helps you as well. Article here
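
As a rough illustration of the link-extraction step asked about (this is not taken from the linked article, and it is plain Python rather than a NiFi processor configuration), the sketch below pulls every `<a href>` ending in `.zip` out of an HTML response and resolves it against the page URL. The names `ZipLinkParser` and `extract_zip_urls` and the example URL are hypothetical. It assumes the file links appear as ordinary anchor tags in the response body.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ZipLinkParser(HTMLParser):
    """Collects the href of every <a> tag that points at a .zip file."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url  # URL of the page the HTML came from
        self.zip_urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name != "href" or not value:
                continue
            # Ignore any query string when checking the extension.
            path = value.split("?", 1)[0]
            if path.lower().endswith(".zip"):
                # Resolve relative links against the page URL.
                self.zip_urls.append(urljoin(self.base_url, value))


def extract_zip_urls(html, base_url):
    """Return absolute URLs of all .zip links found in the HTML text."""
    parser = ZipLinkParser(base_url)
    parser.feed(html)
    parser.close()
    return parser.zip_urls


if __name__ == "__main__":
    # Hypothetical response body and page URL, for illustration only.
    page = '<a href="data/a.zip">a</a> <a href="/b.ZIP">b</a> <a href="readme.txt">r</a>'
    for url in extract_zip_urls(page, "http://example.com/files/index.html"):
        print(url)
```

Inside NiFi, one possible arrangement (an assumption, not something tested here) is to put each extracted URL into its own flow file attribute, pass it to a second InvokeHTTP whose Remote URL references that attribute through Expression Language, and write the response to the target directory with PutFile.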
