Can anyone help me work around a technical limitation in the following use case?
I am reading data from Hive through the selecthive processor and writing the output as CSV. I want to read these FlowFile CSV files through an HTTP-based request, but I am failing to implement the logic.
The sample URI looks as follows:
It should download the CSV file.
The rest-api call you are making will not download the content of a FlowFile.
A NiFi FlowFile consists of two parts:
1. FlowFile Attributes/metadata
2. FlowFile content
You are looking to download the content of a FlowFile.
NiFi does not track FlowFiles by "filename". NiFi only tracks FlowFiles by their assigned uuid, so passing the filename "9b3c4f34-ad98-4835-9412-94228db3adab.0.csv" is not going to result in content being returned.
Using the NiFi rest-api to retrieve the content of a FlowFile from a connection queue is a multi-step process.
Step 1: Make a queue listing request
curl -X POST "http://localhost:8080/nifi-api/flowfile-queues/2dc8fdee-0170-1000-768a-ea14bc96a9cd/listing-requests"
The json response will contain the listing request "id", a uuid that you will use in the next rest-api call to get a listing of the FlowFiles in this queue.
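As a rough illustration of Step 1 outside of curl, the Python sketch below (standard library only) submits the listing request and pulls the "id" out of the json reply. It assumes an unsecured NiFi at http://localhost:8080 as in the curl example, and that the reply wraps the request in a "listingRequest" object. The function names are hypothetical, not part of NiFi.

```python
import json
import urllib.request

NIFI_API = "http://localhost:8080/nifi-api"  # assumption: unsecured local NiFi


def listing_requests_url(connection_id: str) -> str:
    """Build the URL that Step 1 POSTs to for the given connection uuid."""
    return f"{NIFI_API}/flowfile-queues/{connection_id}/listing-requests"


def extract_listing_request_id(reply: dict) -> str:
    """Return the listing request id from the parsed json reply."""
    # Assumption: the reply has the shape {"listingRequest": {"id": "..."}}.
    return reply["listingRequest"]["id"]


def create_listing_request(connection_id: str) -> str:
    """Send the POST (no body, like the curl call) and return the request id."""
    request = urllib.request.Request(
        listing_requests_url(connection_id), method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return extract_listing_request_id(json.load(response))
```

Calling `create_listing_request("2dc8fdee-0170-1000-768a-ea14bc96a9cd")` against a running NiFi should return the uuid to use in the next call.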
Step 2: Get the list of FlowFiles produced by the above request.
curl "http://localhost:8080/nifi-api/flowfile-queues/2dc8fdee-0170-1000-768a-ea14bc96a9cd/listing-requests/<listing request id>"
The above returns json with the list of FlowFiles in the connection queue under "flowFileSummaries", which includes the following attributes for each FlowFile:
--- "uri" rest-api uri for the flowfile
--- "uuid" the UUID for the FlowFile (also found in above uri)
--- "filename" filename attribute for flowfile associated to above UUID.
--- "position", "size", "queuedDuration", "lineageDuration", "penaltyExpiresIn", "clusterNodeId", "clusterNodeAddress", "penalized"
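Since the original goal was to find a FlowFile by its filename, the listing is where the filename can be mapped to a uuid. The Python sketch below is one way to do that, under stated assumptions: the listing request is fetched under ".../listing-requests/<listing request id>", the reply is wrapped in "listingRequest", and it carries a "finished" flag that turns true once the listing is complete. The helper names and the retry settings are hypothetical.

```python
import json
import time
import urllib.request

NIFI_API = "http://localhost:8080/nifi-api"  # assumption: unsecured local NiFi


def summaries_for_filename(reply: dict, filename: str) -> list:
    """Return every FlowFile summary whose "filename" matches exactly."""
    # Several FlowFiles can share a filename, so a list is returned.
    summaries = reply["listingRequest"].get("flowFileSummaries") or []
    return [s for s in summaries if s.get("filename") == filename]


def wait_for_listing(connection_id: str, request_id: str,
                     attempts: int = 10, delay: float = 0.5) -> dict:
    """Poll the listing request until it reports finished, then return it."""
    url = (f"{NIFI_API}/flowfile-queues/{connection_id}"
           f"/listing-requests/{request_id}")
    for _ in range(attempts):
        with urllib.request.urlopen(url) as response:
            reply = json.load(response)
        # Assumption: "finished" becomes true when the listing is complete.
        if reply["listingRequest"].get("finished"):
            return reply
        time.sleep(delay)
    raise TimeoutError("listing request did not finish in time")
```

A caller would pass the result of `wait_for_listing(...)` to `summaries_for_filename(...)` and read the "uuid" of the matching entry.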
Step 3: Use the uri of the FlowFile whose content you want to retrieve.
curl "http://localhost:8080/nifi-api/flowfile-queues/2dc8fdee-0170-1000-768a-ea14bc96a9cd/flowfiles/<flowfile uuid>/content"
Performing the above rest-api call without "/content" on the end will retrieve the FlowFile metadata instead of the content. You may also want to redirect the response of the above rest-api call to a file on local disk, named using the "filename" attribute returned during step 2.
Also keep in mind that you can only download the content of a FlowFile from a queue while that FlowFile is still in the connection queue. If downstream processors are running and consume that FlowFile from the connection queue, it will no longer be available to download from that queue.
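Saving the content under the FlowFile's own filename could look roughly like the Python sketch below. It takes one entry from "flowFileSummaries" and writes the response of the "/content" call to disk. The helper names are hypothetical, and passing "clusterNodeId" as a query parameter on a clustered NiFi is an assumption to check against the rest-api documentation for your version.

```python
import shutil
import urllib.parse
import urllib.request
from pathlib import Path

NIFI_API = "http://localhost:8080/nifi-api"  # assumption: unsecured local NiFi


def content_url(connection_id: str, flowfile_uuid: str,
                cluster_node_id: str = None) -> str:
    """Build the Step 3 URL that returns the FlowFile content."""
    url = (f"{NIFI_API}/flowfile-queues/{connection_id}"
           f"/flowfiles/{flowfile_uuid}/content")
    if cluster_node_id:
        # Assumption: a clustered NiFi accepts the node id as a query parameter.
        url += "?" + urllib.parse.urlencode({"clusterNodeId": cluster_node_id})
    return url


def save_content(summary: dict, connection_id: str,
                 directory: str = ".") -> Path:
    """Download one FlowFile's content into directory, named by "filename"."""
    # Path(...).name drops any directory parts from the filename attribute.
    target = Path(directory) / Path(summary["filename"]).name
    url = content_url(connection_id, summary["uuid"],
                      summary.get("clusterNodeId"))
    with urllib.request.urlopen(url) as response, open(target, "wb") as out:
        shutil.copyfileobj(response, out)  # stream the body to disk
    return target
```

The request fails with an HTTP error if the FlowFile has already left the queue, so the caller may want to catch `urllib.error.HTTPError`.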
If this answers your query, please take a moment to accept the response.
Hope this helps,