Fetching file using FetchAzureBlobStorage Nifi 1.7

Hi,

I have N files in an Azure storage account called X, which contains a container called Y. I'm trying to fetch one specific file from container Y.

I'm using ListAzureBlobStorage -> FetchAzureBlobStorage for this, and in FetchAzureBlobStorage I've set the "Blob" property to the file name.

It fetches all the files in the container. Can anybody please tell me where I am going wrong? Is this how the processor works, or is there another way to do it?

Accepted Solutions

Re: Fetching file using FetchAzureBlobStorage Nifi 1.7

@narasimha chembolu

The ListAzureBlobStorage processor is designed to produce one FlowFile for each blob listed from the target storage. The following attributes are written to each FlowFile it produces:

[Screenshot: attributes written by ListAzureBlobStorage, including "azure.blobname"]

The FetchAzureBlobStorage processor is triggered by each incoming FlowFile it receives. By default, it uses the value that ListAzureBlobStorage assigned to the FlowFile attribute "azure.blobname" to determine which blob to fetch.

In your case, you only want to fetch one specific blob, so you configured the "Blob" property in FetchAzureBlobStorage with a fixed blob name. This means every incoming FlowFile fetches the content of that same blob, which is then inserted into the content of every listed FlowFile.

So your flow is working as designed, but not as you intended.
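
To make the behavior concrete, here is a minimal Python sketch of the flow. It is a toy model only: the functions are plain stand-ins for the processors rather than any NiFi API, and the container contents and blob names are made up for illustration.

```python
# Toy model of ListAzureBlobStorage -> FetchAzureBlobStorage.
# Plain Python stand-ins, not NiFi APIs; blob names and contents are hypothetical.
CONTAINER = {"a.csv": b"alpha", "b.csv": b"beta", "c.csv": b"gamma"}


def list_blobs(container):
    """Like ListAzureBlobStorage: emit one FlowFile (attributes only) per blob."""
    return [{"azure.blobname": name} for name in sorted(container)]


def fetch_blob(flowfile, container, blob_property):
    """Like FetchAzureBlobStorage: resolve the Blob property, then fetch that blob."""
    if blob_property == "${azure.blobname}":
        # Default behavior: take the blob name from the FlowFile attribute.
        name = flowfile["azure.blobname"]
    else:
        # Fixed value: every FlowFile resolves to the same blob.
        name = blob_property
    return {**flowfile, "fetched": name, "content": container[name]}


listed = list_blobs(CONTAINER)

# Blob property hard-coded to one name: one fetch per listed FlowFile,
# and every fetch returns the same blob.
hard_coded = [fetch_blob(ff, CONTAINER, "b.csv") for ff in listed]
print(len(hard_coded), {ff["fetched"] for ff in hard_coded})
```

With three blobs listed, the fixed "Blob" value produces three output FlowFiles that all carry the content of the same blob, which can look as if the whole container was fetched.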

You have two options:

1. Reconfigure your FetchAzureBlobStorage processor to use "${azure.blobname}" in the "Blob" property. Then add a RouteOnAttribute processor between the ListAzureBlobStorage and FetchAzureBlobStorage processors to filter on the specific blob name you are looking for, so that only that listed FlowFile reaches FetchAzureBlobStorage.

2. Don't use the ListAzureBlobStorage processor at all. Instead, use a GenerateFlowFile processor to generate a single 0-byte FlowFile on the primary node, and use it to trigger the FetchAzureBlobStorage processor to fetch the specific blob you want.
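
The two options can be sketched with the same kind of toy Python model. Again, these functions are stand-ins and not NiFi APIs, and "b.csv" is a hypothetical target blob. In NiFi itself, the RouteOnAttribute filter for option 1 would probably be an Expression Language condition along the lines of ${azure.blobname:equals('b.csv')}.

```python
# Toy model of the two options. Plain Python stand-ins, not NiFi APIs;
# blob names, contents, and TARGET are hypothetical.
CONTAINER = {"a.csv": b"alpha", "b.csv": b"beta", "c.csv": b"gamma"}
TARGET = "b.csv"


def list_blobs(container):
    """Like ListAzureBlobStorage: emit one FlowFile (attributes only) per blob."""
    return [{"azure.blobname": name} for name in sorted(container)]


def route_on_attribute(flowfiles, target):
    """Like RouteOnAttribute: pass on only FlowFiles whose blob name matches."""
    return [ff for ff in flowfiles if ff["azure.blobname"] == target]


def fetch_blob(flowfile, container, blob_name):
    """Like FetchAzureBlobStorage: attach the named blob's content to the FlowFile."""
    return {**flowfile, "content": container[blob_name]}


# Option 1: list everything, filter, then fetch by the FlowFile's own attribute.
matched = route_on_attribute(list_blobs(CONTAINER), TARGET)
option_1 = [fetch_blob(ff, CONTAINER, ff["azure.blobname"]) for ff in matched]

# Option 2: skip the listing; a single empty trigger FlowFile drives one fetch
# with the blob name fixed in the Blob property.
trigger = {}
option_2 = [fetch_blob(trigger, CONTAINER, TARGET)]

print(len(option_1), len(option_2))
```

Both variants end with exactly one FlowFile holding the target blob. Option 1 still lists the whole container on every run, while option 2 avoids the listing but requires the blob name to be known in advance.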

Thank you,

Matt

If you found that this answer addressed your question, please take a moment to log in and click the "ACCEPT" link.

