
Move entire folder based on date to S3

Solved

Contributor

Use case: I need to transfer an entire folder of files, organized by date, to S3.

Example:

Source path (Linux): /users/abc/20200312/gtry/gyyy.csv
S3 target:           /Users/datastore/20200312/gyyy.csv

 

Since the date folder changes every day, I need to build a dataflow that picks up files from the current date's folder.

1 ACCEPTED SOLUTION


Re: Move entire folder based on date to S3

Master Guru

@Gubbi 

Depending on which processor is being used to create your FlowFile from your source Linux directory, you will likely have an "absolute.path" attribute set on the FlowFile.

absolute.path = /users/abc/20200312/gtry/
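For example, assuming the flow starts with ListFile or GetFile (the answer does not name the source processor, so this is an assumption), the FlowFile picked up from the example path would also carry a "filename" attribute, which is used again further down:

filename = gyyy.csv
absolute.path = /users/abc/20200312/gtry/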

 

You can pass that FlowFile to an UpdateAttribute processor, which can use NiFi Expression Language (EL) to extract the date from that absolute path into a new FlowFile attribute.

Add a new property (the property name becomes the new FlowFile attribute):

Property:          Value:

pathDate          ${absolute.path:getDelimitedField('4','/')}

 

The resulting FlowFile will have a new attribute:
pathDate = 20200312
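
As a side note on the index: counting fields the way a plain split on "/" would, the leading "/" makes the empty string field 1, so the date lands in field 4, which matches the pathDate = 20200312 result above. A minimal Python sketch of the same 1-based extraction (illustration only, not NiFi code):

# Mimics ${absolute.path:getDelimitedField('4','/')} from the flow above.
def get_delimited_field(value, index, delimiter):
    # NiFi EL's getDelimitedField counts fields starting at 1.
    return value.split(delimiter)[index - 1]

print(get_delimited_field("/users/abc/20200312/gtry/", 4, "/"))  # -> 20200312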

Now you can use that FlowFile attribute later when writing to your target directory in S3.

I assume you would use the PutS3Object processor for this?
If so, you can configure the "Object Key" property with the following:

/Users/datastore/${pathDate}/${filename}

 

NiFi EL will replace ${pathDate} with "20200312" and ${filename} will be replaced with "gyyy.csv".
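
To make the substitution concrete, here is a minimal sketch (plain Python, illustration only; inside NiFi the replacement is done by the EL in the Object Key property):

# Attribute values carried by the example FlowFile at this point.
attributes = {"pathDate": "20200312", "filename": "gyyy.csv"}

# Same template as the Object Key property, resolved by hand.
object_key = "/Users/datastore/{pathDate}/{filename}".format(**attributes)
print(object_key)  # -> /Users/datastore/20200312/gyyy.csv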

 

Hope this helps you,

Matt


3 REPLIES

Re: Move entire folder based on date to S3

Contributor

@Shu_ashu: I saw you had suggested a similar pattern of solution before. Could you please look into this and suggest an approach?

 

https://community.cloudera.com/t5/Support-Questions/NiFi-Creating-the-output-directory-from-the-cont...


Re: Move entire folder based on date to S3

Contributor

@MattWho: I would appreciate your input as well.

