UnpackContent error "Unable to unpack FlowFile because it does not appear to have any entries" on a simple WinZip file

Explorer

Hello,

I am attempting to unzip a series of simple .zip files in my NiFi flow. All of my flow executes with no issues except when I try to unpack. There is nothing special about these zip files (e.g. no passwords). But no matter which way I try to unpack the zip files, I get the error that the zip file does not contain any entries. I have verified that the zip files do contain entries. I can easily unzip them with WinZip. I can also easily unzip them with our old custom C# application that we are replacing with NiFi.

I have tried adjusting the UnpackContent settings and have even added an UpdateAttribute to set the application/zip MIME type as well. I still keep getting the failure. I have also tried the CompressContent processor and still get these errors on simple WinZip files.

I am attaching my flow, my processor settings and error message.

 


[Attached images: Unpack3.jpg, Unpack2.jpg, Unpack1.jpg, Unpack4.jpg]

1 ACCEPTED SOLUTION

Master Mentor

@BobKing 

ListFile only creates 0-byte (no content) FlowFiles, which must be sent to a FetchFile processor to retrieve the content and add it to the FlowFile. The List<abc> and Fetch<abc> type processors should be used instead of the Get<abc> type processors when working in a multi-node NiFi cluster setup. These processors allow you to run List<abc> on the primary node only, load-balance the listed FlowFiles across all nodes in the NiFi cluster, and then Fetch<abc> the content. This spreads the workload across the cluster for ingest types that don't support cluster setups (ListFile, ListSFTP, ListFTP, etc.).
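
As a rough analogy only (this is plain Python, not NiFi code), the sketch below shows why a listed-but-not-fetched FlowFile cannot be unpacked: a 0-byte payload is not a readable zip archive, while the fetched bytes are. The helper name `count_entries` and the in-memory sample archive are made up for illustration.

```python
import io
import zipfile


def count_entries(content: bytes) -> int:
    """Return the number of file entries in zip content, or 0 if it is not a readable zip."""
    try:
        with zipfile.ZipFile(io.BytesIO(content)) as archive:
            return sum(1 for info in archive.infolist() if not info.is_dir())
    except zipfile.BadZipFile:
        # Empty or non-zip content ends up here.
        return 0


# Build a small sample archive in memory (stand-in for a zip file on disk).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("report.txt", "hello")
fetched_content = buffer.getvalue()

# Stand-in for what a List-style processor passes downstream: no content at all.
listed_only_content = b""

print(count_entries(listed_only_content))  # nothing to unpack
print(count_entries(fetched_content))      # one file entry
```

In flow terms, the assumption here is that ListFile -> FetchFile -> UnpackContent corresponds to the second call, and ListFile -> UnpackContent to the first.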

TIP:
Your attached image was so small I could not read the processor types. When you add an image to a new post, you can click on it and expand its size by dragging from the corner of the image before hitting "Reply".

Thanks,
Matt

6 REPLIES

Master Mentor

@BobKing 

Welcome to the Cloudera Community.

It is going to be difficult to determine what is going on here without a sample failing zip file to reproduce with.

What can you tell me about these WinZip files?

  1. How are they generated?
  2. Do they contain any files, or only directories? (NiFi only creates FlowFiles for actual content, so a zip file containing no files and only a bunch of empty directories would fail to unpack.)
  3. Are these multi-part zip files?
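
One way to answer the second question outside NiFi is to list the archive's entries and separate real files from directory entries. The following is a minimal Python sketch; the helper name `summarize_entries` and the in-memory sample archive are illustrative only, and it does not attempt to detect multi-part archives.

```python
import io
import zipfile


def summarize_entries(content: bytes) -> dict:
    """Split the entries of zip content held in memory into files and directories."""
    with zipfile.ZipFile(io.BytesIO(content)) as archive:
        infos = archive.infolist()
    return {
        "files": [info.filename for info in infos if not info.is_dir()],
        "directories": [info.filename for info in infos if info.is_dir()],
    }


# Sample archive holding only an empty directory entry (no real file content).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("empty_dir/", b"")
dirs_only = buffer.getvalue()

summary = summarize_entries(dirs_only)
print(summary)
```

For a real archive, the bytes could be read from disk first, for example with `pathlib.Path("sample.zip").read_bytes()`, where `sample.zip` is a placeholder name. An empty `files` list would point to the directories-only case described above.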

Thank you,

Matt

Explorer

These are WinZip files we get from a government website that are set up for public use and reference. Some may contain a directory structure and some may not. I have verified that files do exist in the zip files.

I am unable to attach a zip file because the .zip extension is not allowed or supported on the Cloudera community site. So I am unable to upload a zip file.

"The file type (.zip) is not supported. Valid file types are: .docx, .xlsx, .pptx, .pdf, .txt, .csv, .png, jpg, .jpeg, .gif, docx, xlsx, pptx, pdf, txt, csv, png, jpeg, gif."

Accessing the link on the government site will require credentials so posting the link will not be helpful either.

Even if I create a WinZip file with a single file in it, I still get the same error.

I also tried CompressContent using the same MIME type I set in UpdateAttribute. While I do not get an error, the FlowFile output is a list of files with the same name as the zip file. It does not matter whether I have the Update Filename property set to True or False.

Master Mentor

@BobKing 

I have not been able to reproduce.  I downloaded WinZip, created a simple text file and then zipped it.  I then consumed that .zip file using CFM 2.1.7.1001 and was able to successfully use UnpackContent to unpack the text file.

I recommend, if you have a support contract with Cloudera, that you create a support case where you can share more detail about the problematic zip files and an example file if possible.  I suspect the issue is specific to the zip files you are working with.  

UnpackContent does not support multi-part zip files (NIFI-10654).


Thank you,
Matt

Explorer

Matt,

I figured it out. It was not so much the UnpackContent processor, but how I was bringing in the FlowFile itself. I was using ListFile and passing its output straight to the UnpackContent processor. I switched to a GetFile processor with a File Filter of .*\.zip and the unzip worked perfectly. So it looks like I will need to get the file with GetFile, unzip, and archive the file with PutFile.
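
For anyone unsure what the File Filter above would pick up, here is a small Python sketch of the same regular expression. It assumes the filter is applied as a full match against the file name; the sample file names and the helper `accepted` are made up for illustration.

```python
import re

# Same pattern as the File Filter value mentioned above.
FILE_FILTER = re.compile(r".*\.zip")


def accepted(name: str) -> bool:
    """Return True when the whole file name matches the filter (assumed full match)."""
    return FILE_FILTER.fullmatch(name) is not None


names = ["data.zip", "data.zip.tmp", "DATA.ZIP", "notes.txt"]
picked = [name for name in names if accepted(name)]
print(picked)
```

Under that assumption the pattern is case-sensitive and requires the name to end in .zip, so partially written files with a temporary suffix would be skipped.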

Visitor

Thanks for clarifying the ListFile vs Get<abc> usage in cluster mode. That load-balancing tip really helps when scaling out. I’ll definitely try adjusting the image size next time too, didn’t realize it could be dragged larger before replying.