Transfer files to S3 based on file timestamp

Contributor

My use case is this: I am reading files whose names end with a date stamp, and these files have to be transferred to S3 into folders named after the respective dates. Example file names: abcd.out.gz.20200303, abcd.out.gz.20200302

The file abcd.out.gz.20200303 needs to be in S3 under /data/20200303

and the file abcd.out.gz.20200302 under /data/20200302.

How can I achieve this in NiFi?

Accepted Solutions
Re: Transfer files to S3 based on file timestamp

Contributor

I found a solution for this. I had to use Expression Language in the Object Key property to extract the date from the file name, and it worked. The expression is: ${filename:substringAfter('.gz.')}/${filename}
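
To make it easier to see what that expression evaluates to, here is a minimal Python sketch that mimics it outside NiFi. It is an illustration only, not NiFi code: the helper names `substring_after` and `object_key` and the optional `prefix` argument are hypothetical, and the fallback behaviour (returning the whole string when the marker is missing) is my understanding of how `substringAfter` behaves. Note that the expression as posted yields keys such as 20200303/abcd.out.gz.20200303; a leading data/ would have to be added to the Object Key separately.

```python
def substring_after(subject: str, marker: str) -> str:
    """Mimic NiFi EL substringAfter: the text after the first occurrence of marker.

    Assumption: when marker is absent, the whole subject is returned unchanged.
    """
    _head, sep, tail = subject.partition(marker)
    return tail if sep else subject


def object_key(filename: str, prefix: str = "") -> str:
    """Build the key that ${filename:substringAfter('.gz.')}/${filename} would produce.

    prefix is a hypothetical extra (for example "data/") that is not part of
    the posted expression.
    """
    date_part = substring_after(filename, ".gz.")
    return f"{prefix}{date_part}/{filename}"


if __name__ == "__main__":
    for name in ("abcd.out.gz.20200303", "abcd.out.gz.20200302"):
        # Without a prefix this matches the posted expression exactly.
        print(object_key(name))
        # With the assumed "data/" prefix it matches the layout asked for in the question.
        print(object_key(name, prefix="data/"))
```

With the two sample file names from the question, the prefixed form gives data/20200303/abcd.out.gz.20200303 and data/20200302/abcd.out.gz.20200302.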


Re: Transfer files to S3 based on file timestamp

New Contributor

1. ListS3: List all object file paths in the specific bucket.

2. RouteOnAttribute: Filter out unused files (optional).

3. FetchS3Object: Fetch the file.

4. UpdateAttribute: Rename filename (the file path in the bucket) to the specific path. ex:

5. PutS3Object: Put the file to the specific bucket.

PS: Once FetchS3Object executes, the files will be loaded into memory. So it's better to limit the Back Pressure Object Threshold or Size Threshold on the connection after FetchS3Object.

Re: Transfer files to S3 based on file timestamp

Contributor

@AustinLiu: But I need to transfer the file abcd_20200303 to the S3 folder 20200303, and likewise for every other date. Every day when the files arrive, my processor should identify each file's date and push it to the respective date folder in S3.
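
The file names in this thread appear in two shapes (abcd.out.gz.20200303 and abcd_20200303), so the date cannot always be found by splitting on one fixed separator. As a rough sketch of the "identify the file by its date" step, the Python below pulls a trailing eight-digit date off either shape and checks that it is a real calendar date. This is plain Python for illustration, not NiFi configuration; the function name `date_folder` and the yyyyMMdd assumption are mine.

```python
import re
from datetime import datetime
from typing import Optional

# Assumption: the date is always the last eight digits of the file name (yyyyMMdd).
_TRAILING_DATE = re.compile(r"(\d{8})$")


def date_folder(filename: str) -> Optional[str]:
    """Return the trailing yyyyMMdd date of filename, or None if there is none."""
    match = _TRAILING_DATE.search(filename)
    if match is None:
        return None
    candidate = match.group(1)
    try:
        # Reject digit runs that are not real dates, such as month 13.
        datetime.strptime(candidate, "%Y%m%d")
    except ValueError:
        return None
    return candidate


if __name__ == "__main__":
    for name in ("abcd_20200303", "abcd.out.gz.20200302", "abcd.out.gz"):
        print(name, "->", date_folder(name))
```

Files for which this returns None would need separate handling, for example routing them aside rather than uploading them under a wrong folder.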

Re: Transfer files to S3 based on file timestamp

Contributor

@AustinLiu Just to clarify, I am transferring files from a Linux box to S3.
