Apache NiFi: How to get all CSV files from a folder over HTTP with the InvokeHTTP processor

New Contributor

Example: I have data at http://20.30.06.08/files/test/
and that URL lists many CSV files, for example:

Name               Last modified     Size
test-20240625.csv  2024-06-26 08:14  3.9K
test-20240626.csv  2024-06-27 08:14  2.9K
test-20240627.csv  2024-06-28 08:14  1.9K

How can I get all of the CSV files from the test folder with Apache NiFi (InvokeHTTP processor)?



Community Manager

@greenflag, welcome to our community! To help you get the best possible answer, I have tagged our NiFi experts @MattWho and @SAMSAL, who may be able to assist you further.

Please feel free to provide any additional information or details about your query, and we hope that you will find a satisfactory solution to your question.



Regards,

Vidya Sargur,
Community Manager


Master Mentor

@greenflag 

Since I don't know anything about this REST API endpoint, all I have are questions.

How would you complete this task outside of NiFi?
How would you accomplish this using curl from the command line?

What do the REST API docs for your endpoint say about how to get files?
Do they expect you to pass the filename in the REST API request?
Which REST API endpoint returns the list of files?

My initial thought (making numerous assumptions about your endpoint) is that you may need more than one InvokeHTTP processor.
The first InvokeHTTP in the dataflow calls the REST API endpoint that returns the list of files in the directory, so the list ends up in the content of the FlowFile. You then split that FlowFile by its content so that you have multiple FlowFiles (one per listed file), rename each FlowFile to its unique filename, and finally pass each one to a second InvokeHTTP processor that fetches that specific file.
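
To make the "list, split, fetch" idea concrete outside of NiFi, here is a rough Python sketch of the same three steps. It rests on an assumption I can't verify: that the folder URL returns an HTML index page in which each file appears as an <a href="..."> link. If your endpoint returns JSON or plain text instead, the parsing step would look different. The names CsvLinkParser, list_csv_urls and fetch_all_csv are made up for this illustration; they are not part of NiFi or of any library.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class CsvLinkParser(HTMLParser):
    """Collect href values ending in .csv from an HTML directory index.

    Assumption: the listing page exposes each file as an <a href="..."> link.
    """

    def __init__(self):
        super().__init__()
        self.csv_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value and value.lower().endswith(".csv"):
                self.csv_links.append(value)


def list_csv_urls(base_url, listing_html):
    """Step 2 (the 'split'): turn one listing into one URL per CSV file."""
    parser = CsvLinkParser()
    parser.feed(listing_html)
    return [urljoin(base_url, href) for href in parser.csv_links]


def fetch_all_csv(base_url):
    """Step 1 fetches the listing; step 3 fetches each file it names."""
    with urlopen(base_url) as resp:
        listing_html = resp.read().decode("utf-8", errors="replace")

    files = {}
    for url in list_csv_urls(base_url, listing_html):
        filename = url.rsplit("/", 1)[-1]
        with urlopen(url) as resp:
            files[filename] = resp.read()  # raw bytes of one CSV file
    return files
```

You would call fetch_all_csv with the folder URL, including the trailing slash so that relative links resolve inside the folder. In NiFi terms, the first urlopen call plays the role of the first InvokeHTTP, list_csv_urls is the split-and-rename step, and the loop body is the second InvokeHTTP.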

Thank you,
Matt