
pushing multiple objects to an S3 bucket using PutS3Object in NiFi

Explorer

I took a JSON array, split it into multiple JSON files, and performed some operations on them.

Then I converted each of these JSON files to parquet and wanted to upload each individual parquet file to S3.

Uploading a single JSON file or a JSON array was not an issue, but if I had (for example) 3 parquet files to upload, only 1 made it into S3 and the other 2 were lost. I am therefore assuming the PutS3Object processor allows only 1 file to be uploaded at any given time (correct me if I am wrong).

That leaves me with two solutions:

1. Either somehow allow multiple files to be uploaded into S3

2. Or merge the individual JSON files back into one JSON array, then do the conversion and push the result to S3

 

I would be grateful if anybody could help me with either of these solutions.

Thanks in advance.

1 REPLY

Super Guru

@P_Rat98 You need to set a unique filename (Object Key) on each parquet flowfile so they are saved as different S3 objects. If the processor's Object Key is left at its default of ${filename} and every flowfile carries the same filename attribute, each upload will overwrite the previous one.
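One way to make the keys unique (a sketch, assuming the parquet flowfiles came from a split and carry the standard fragment attributes) is an UpdateAttribute processor placed just before PutS3Object:

```
# UpdateAttribute processor (before PutS3Object) -- hypothetical property values:

# append the split index so each part gets its own key
filename : ${filename}-${fragment.index}.parquet

# or, if the flowfiles did not come from a split, force uniqueness with a UUID
filename : ${UUID()}.parquet
```

Since PutS3Object's Object Key defaults to ${filename}, rewriting that attribute is enough; you could also put the expression directly in the Object Key property.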

 

For the second option: if you have a split in your data flow, the split parts should carry key/value attributes for the fragment index and the total number of splits. Inspect your queue and list the attributes on the split flowfiles to confirm these are present. You can then use these attributes with MergeContent to merge everything back into a single flowfile. Note that you need to do this before converting to parquet, not after.
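A minimal MergeContent configuration for this (a sketch, assuming the splits came from a processor such as SplitJson, which writes the standard fragment.identifier / fragment.index / fragment.count attributes) might look like:

```
# MergeContent processor -- relevant properties (sketch):
Merge Strategy : Defragment            # waits for all fragment.count pieces
                                       # sharing a fragment.identifier,
                                       # then emits one merged flowfile
Merge Format   : Binary Concatenation
Header         : [                     # rebuild a valid JSON array around
Footer         : ]                     # the concatenated elements
Demarcator     : ,
```

With Defragment, the merge order follows fragment.index, so the reassembled array matches the original.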


If this answer resolves your issue or allows you to move forward, please choose to ACCEPT this solution and close this topic. If you have further questions on this topic, please comment here or feel free to private message me. If you have new questions related to your use case, please create a separate topic and feel free to tag me in your post.

 

Thanks,


Steven @ DFHZ