Member since 09-08-2023 | 3 Posts | 0 Kudos Received | 0 Solutions
09-11-2023 10:04 AM
@SAMSAL, just for my understanding and as a potential workaround: when I call UnpackContent, are the extracted items stored in memory or on disk? If they are in memory, I assume I could call UpdateAttribute on absolute.path to make it unique immediately after UnpackContent, and the rest of my flow would still work? In that case, is there no physical file to move on disk after the unpack?
09-08-2023 12:50 PM
After UnpackContent, we write the extracted files to S3 with PutS3Object. The S3 keys are generated from context to be unique, as you suggest, but the problem is that the underlying file content is corrupted if file2.zip is extracted while the content from file1.zip is still waiting to be written to S3.
09-08-2023 11:32 AM
Hi, I am using UnpackContent to extract files from a zip and then load them to another location. I am seeing corruption in the files that arrive at that other location, and I think it is caused by a race condition on the extracted files.

The zip files have a directory and file structure that looks something like this:

file1.zip -> data/file.txt
file2.zip -> data/file.txt

The contents of file.txt differ between the two zip files, but the files have the same name. We use UnpackContent with Packaging Format 'zip' and File Filter 'data' to get only the files in the data directory.

This works fine when processing individual files, but now that we have scaled up, the file extracted from file1.zip appears to get overwritten by the file extracted from file2.zip, so when we copy the file over, its content is a corrupted mix of the two.

Looking at the attributes after UnpackContent, I see that absolute.path is something like <NiFi Location>/data/file.txt after extraction, so I think the two archives are being extracted to the same path and corrupting each other before the files can be moved to the next location.

Is there any way to change where UnpackContent puts these files so they don't clobber each other? Maybe something like <NiFi Location>/file1/data/file.txt?
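
To make the layout I am asking about concrete, here is a rough sketch in plain Python. It is not NiFi code and assumes nothing about how UnpackContent stores its output. The helpers `make_zip` and `unpack_with_unique_paths` are hypothetical names made up for this illustration, and the `data/` prefix check is only a stand-in for the File Filter.

```python
import io
import zipfile
from pathlib import PurePosixPath


def make_zip(entries):
    """Build an in-memory zip from {entry_name: bytes} (hypothetical demo helper)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, data in entries.items():
            zf.writestr(name, data)
    return buf.getvalue()


def unpack_with_unique_paths(archive_name, archive_bytes, prefix="data/"):
    """Return {unique_path: content} for the entries found under `prefix`.

    The archive's base name (file1.zip -> file1) is prepended, so entries
    with the same name in different archives never share a path.
    """
    stem = PurePosixPath(archive_name).stem
    out = {}
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        for info in zf.infolist():
            # Skip directory entries and anything outside the wanted prefix.
            if info.is_dir() or not info.filename.startswith(prefix):
                continue
            out[f"{stem}/{info.filename}"] = zf.read(info)
    return out


# Two archives that both contain data/file.txt with different content.
archives = {
    "file1.zip": make_zip({"data/file.txt": b"from file1", "other/skip.txt": b"x"}),
    "file2.zip": make_zip({"data/file.txt": b"from file2"}),
}

extracted = {}
for name, payload in archives.items():
    extracted.update(unpack_with_unique_paths(name, payload))

for path in sorted(extracted):
    print(path, extracted[path])
```

With the archive name as a prefix, the two entries end up at file1/data/file.txt and file2/data/file.txt instead of both landing on data/file.txt.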
Labels: Apache NiFi