Access the Nifi flow file of CSV output using http

Hi Team,
Can anyone help me work around a technical limitation in the following use case?

I am reading data from Hive through the selecthive processor and writing the output as CSV. I want to read the resulting CSV flow file through an HTTP request, but I am failing to implement the logic.

The sample URI looks as follows:

http://localhost:8080/nifi-api/flowfile-queues/2dc8fdee-0170-1000-768a-ea14bc96a9cd/flowfiles/9b3c4f34-ad98-4835-9412-94228db3adab.0.csv

 

This should download the CSV file.

Re: Access the Nifi flow file of CSV output using http

@RAM17 

 

The rest-api call you are making will not download the content of a FlowFile.

A NiFi FlowFile consists of two parts:
1. FlowFile Attributes/metadata
2. FlowFile content

You are looking to download the content of a FlowFile.

NiFi does not track FlowFiles using the "filename".  NiFi only tracks FlowFiles by their assigned uuid.  So passing the filename "9b3c4f34-ad98-4835-9412-94228db3adab.0.csv" is not going to result in content being returned.

Using the NiFi rest-api to retrieve the content of a FlowFile from a connection queue is a multi-step process.

Step 1:  Make a queue listing request

curl -X POST "http://localhost:8080/nifi-api/flowfile-queues/2dc8fdee-0170-1000-768a-ea14bc96a9cd/listing-requests"

The JSON response will contain the listing request "id", a UUID that you will use in the next rest-api call to get a listing of the FlowFiles in this queue.

Step 2: Get list of FlowFiles created by above request.

curl "http://localhost:8080/nifi-api/flowfile-queues/2dc8fdee-0170-1000-768a-ea14bc96a9cd/listing-requests/<listing request id>"

The above will return JSON listing the FlowFiles in the connection queue under "flowFileSummaries", which includes the following attributes for each FlowFile:
--- "uri": the rest-api uri for the FlowFile
--- "uuid": the UUID of the FlowFile (also found in the above uri)
--- "filename": the filename attribute of the FlowFile associated with the above UUID
--- "position", "size", "queuedDuration", "lineageDuration", "penaltyExpiresIn", "clusterNodeId", "clusterNodeAddress", "penalized"
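
Steps 1 and 2 can be sketched in Python as follows. This is only a minimal sketch and not part of the original answer: it assumes an unsecured NiFi at http://localhost:8080, it assumes the JSON responses wrap the listing in a "listingRequest" object carrying a "finished" flag, and the helper names (listing_requests_url, find_summary, list_queue) are made up for illustration.

```python
import json
import time
import urllib.request

NIFI_API = "http://localhost:8080/nifi-api"  # assumption: unsecured local NiFi


def listing_requests_url(connection_id, request_id=None):
    """Build the listing-requests URL used in step 1 (no id) and step 2 (with id)."""
    url = f"{NIFI_API}/flowfile-queues/{connection_id}/listing-requests"
    return url if request_id is None else f"{url}/{request_id}"


def find_summary(listing_response, filename):
    """Return the flowFileSummaries entry whose "filename" matches, or None."""
    # Assumption: the summaries sit under a top-level "listingRequest" object.
    summaries = listing_response.get("listingRequest", {}).get("flowFileSummaries") or []
    for summary in summaries:
        if summary.get("filename") == filename:
            return summary
    return None


def list_queue(connection_id, poll_seconds=0.5, max_polls=20):
    """Step 1: POST a listing request. Step 2: GET it until it reports finished."""
    post = urllib.request.Request(
        listing_requests_url(connection_id), data=b"", method="POST"
    )
    with urllib.request.urlopen(post) as resp:
        listing = json.load(resp)
    request_id = listing["listingRequest"]["id"]
    for _ in range(max_polls):
        if listing["listingRequest"].get("finished"):
            break
        time.sleep(poll_seconds)
        with urllib.request.urlopen(
            listing_requests_url(connection_id, request_id)
        ) as resp:
            listing = json.load(resp)
    return listing
```

Calling find_summary(list_queue("2dc8fdee-0170-1000-768a-ea14bc96a9cd"), "9b3c4f34-ad98-4835-9412-94228db3adab.0.csv") would then give the summary entry, including its "uuid", for the CSV FlowFile, provided it is still in the queue.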

 

Step 3: Request the content using the uri of the FlowFile you want to retrieve

curl "http://localhost:8080/nifi-api/flowfile-queues/2dc8fdee-0170-1000-768a-ea14bc96a9cd/flowfiles/<flowfile uuid>/content"

Making the above rest-api call without "/content" on the end retrieves the FlowFile metadata instead of the content.  You may also want to redirect the response of the above call to a file on local disk, named using the "filename" attribute returned in step 2.
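
For step 3, saving the content to disk could look roughly like the Python below. Again this is a sketch under assumptions rather than a definitive implementation: it assumes an unsecured NiFi at http://localhost:8080, and content_url and download_content are hypothetical helper names.

```python
import shutil
import urllib.request

NIFI_API = "http://localhost:8080/nifi-api"  # assumption: unsecured local NiFi


def content_url(connection_id, flowfile_uuid):
    """Build the step 3 URL; without the trailing "/content" NiFi returns metadata."""
    return (
        f"{NIFI_API}/flowfile-queues/{connection_id}"
        f"/flowfiles/{flowfile_uuid}/content"
    )


def download_content(connection_id, flowfile_uuid, target_path):
    """Stream the FlowFile content into target_path and return that path."""
    with urllib.request.urlopen(content_url(connection_id, flowfile_uuid)) as resp, \
            open(target_path, "wb") as out:
        # Copy in chunks so a large CSV is not held in memory all at once.
        shutil.copyfileobj(resp, out)
    return target_path
```

Here target_path would typically be the "filename" value returned in step 2, and flowfile_uuid the matching "uuid".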

 

Also keep in mind that you can only download the content of a FlowFile while that FlowFile is still in the connection queue.  If downstream processors are running and consume that FlowFile from the connection queue, it will no longer be available to download from that queue.

 

If this answered your query, please take a moment to accept the response.

 

Hope this helps,

Matt

 
