Apache NiFi: How to get all CSV files from an API folder with the InvokeHTTP processor
- Labels: Apache NiFi
Created on 06-27-2024 09:20 PM - edited 06-28-2024 12:31 AM
Example: I have data at http://20.30.06.08/files/test/
and the API exposes many CSV files there, such as:
Name               Last modified     Size
test-20240625.csv  2024-06-26 08:14  3.9K
test-20240626.csv  2024-06-27 08:14  2.9K
test-20240627.csv  2024-06-28 08:14  1.9K
How can I get all the CSV files from the test folder with Apache NiFi (InvokeHTTP processor)?
Created 06-27-2024 11:43 PM
@greenflag, Welcome to our community! To help you get the best possible answer, I have tagged in our NiFi experts @MattWho @SAMSAL who may be able to assist you further.
Please feel free to provide any additional information or details about your query, and we hope that you will find a satisfactory solution to your question.
Regards,
Vidya Sargur, Community Manager
Created 07-02-2024 06:59 AM
@greenflag
Not knowing anything about this REST API endpoint, all I have are questions.
How would you complete this task outside of NiFi?
How would you accomplish this using curl from the command line?
What do the REST API docs for your endpoint say about how to get files?
Do they expect you to pass the filename in the REST API request?
Which REST API endpoint returns the list of files?
My initial thought here (making numerous assumptions about your endpoint) is that you would probably need more than one InvokeHTTP processor.
The first InvokeHTTP in the dataflow hits the REST API endpoint that returns the list of files in the endpoint directory; that list ends up in the content of the FlowFile. Then you split that FlowFile by its content so you have multiple FlowFiles (one per listed file). Then rename each FlowFile using its unique filename, and finally pass each one to a second InvokeHTTP processor that fetches that specific file.
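To make the list-then-fetch idea concrete, here is a minimal Python sketch of the same logic outside NiFi. It assumes the endpoint returns an HTML directory listing in which each file is an `<a href="...">` link; that is only a guess about your server. The names `CsvLinkParser`, `list_csv_urls`, and `SAMPLE_LISTING` are hypothetical and exist only for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class CsvLinkParser(HTMLParser):
    """Collects href values ending in .csv from an HTML directory listing."""

    def __init__(self):
        super().__init__()
        self.csv_links = []

    def handle_starttag(self, tag, attrs):
        # Only anchor tags carry the file links we care about.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value and value.lower().endswith(".csv"):
                self.csv_links.append(value)


def list_csv_urls(listing_html, base_url):
    """Return one absolute URL per .csv link found in the listing page."""
    parser = CsvLinkParser()
    parser.feed(listing_html)
    return [urljoin(base_url, link) for link in parser.csv_links]


# Assumed listing body, modelled on the file names in the question.
SAMPLE_LISTING = """
<html><body>
<a href="../">Parent Directory</a>
<a href="test-20240625.csv">test-20240625.csv</a>
<a href="test-20240626.csv">test-20240626.csv</a>
<a href="test-20240627.csv">test-20240627.csv</a>
</body></html>
"""

BASE_URL = "http://20.30.06.08/files/test/"

if __name__ == "__main__":
    # Each printed URL is what the second request (one per file) would fetch.
    for url in list_csv_urls(SAMPLE_LISTING, BASE_URL):
        print(url)
```

In the NiFi flow, the parsing step corresponds to splitting the listing into one FlowFile per file, and each resulting URL corresponds to one request made by the second InvokeHTTP. If your endpoint returns JSON or plain text instead of HTML, the parsing step would need to change accordingly.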
Please help our community thrive. If any of the suggestions or solutions provided helped you solve your issue or answer your question, please take a moment to log in and click "Accept as Solution" on one or more of them.
Thank you,
Matt
