Move entire folder based on date to S3

Contributor

Use case: I need to transfer an entire folder of files to S3, with the target path based on the date in the source path.

Example:

Source path (Linux): /users/abc/20200312/gtry/gyyy.csv
S3 target: /Users/datastore/20200312/gyyy.csv

The date folder changes every day, so I need to build a dataflow that picks up files from the date folder.

1 ACCEPTED SOLUTION

Master Guru

@Gubbi 

Depending on which processor you use to create your FlowFile from your source Linux directory, you will likely have an "absolute.path" attribute on the FlowFile.

absolute.path = /users/abc/20200312/gtry/

 

You can pass that FlowFile to an UpdateAttribute processor, which can use NiFi Expression Language (EL) to extract the date from that absolute path into a new FlowFile attribute.

Add a new property (the property name becomes the new FlowFile attribute name):

Property:          Value:
pathDate          ${absolute.path:getDelimitedField('4','/')}

Field indexes start at 1, and the leading "/" makes field 1 empty, so the date folder is field 4.

 

The resulting FlowFile will have a new attribute:
pathDate = 20200312
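
To see why index 4 picks out the date, here is a minimal Python sketch that mimics the field counting. The function name get_delimited_field is hypothetical, and it is only a rough stand-in: the real NiFi function also handles quoting and escape characters, and the out-of-range behavior shown here is an assumption.

```python
def get_delimited_field(value: str, index: int, delimiter: str = ",") -> str:
    """Rough stand-in for NiFi EL getDelimitedField (1-based index).

    Simplification: no quote or escape handling. Returning "" for an
    out-of-range index is an assumption made for this sketch.
    """
    fields = value.split(delimiter)
    if index < 1 or index > len(fields):
        return ""
    return fields[index - 1]


absolute_path = "/users/abc/20200312/gtry/"

# Splitting on "/" gives: ['', 'users', 'abc', '20200312', 'gtry', '']
# The leading "/" produces an empty first field, so the date is field 4.
path_date = get_delimited_field(absolute_path, 4, "/")
print(path_date)  # 20200312
```

If your source directory sits at a different depth, the index has to change accordingly.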

Now you can use that FlowFile attribute later when writing to your target directory in S3.

I assume you would use the PutS3Object processor for this.
If so, you can configure its "Object Key" property with the following:

/Users/datastore/${pathDate}/${filename}

 

NiFi EL will replace ${pathDate} with "20200312" and ${filename} will be replaced with "gyyy.csv".
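
The substitution can be sketched outside NiFi as well. The following Python snippet is only an illustration of how the attribute references resolve: build_object_key is a hypothetical helper, it handles plain ${name} references only (no EL functions), and treating a missing attribute as an empty string is an assumption of this sketch.

```python
import re


def build_object_key(attributes: dict,
                     template: str = "/Users/datastore/${pathDate}/${filename}") -> str:
    """Replace plain ${name} references with FlowFile attribute values.

    Hypothetical helper for illustration only; it does not evaluate
    EL functions. Missing attributes become "" (an assumption).
    """
    def substitute(match: re.Match) -> str:
        return attributes.get(match.group(1), "")

    return re.sub(r"\$\{([A-Za-z0-9_.]+)\}", substitute, template)


# Attributes as they would look after the UpdateAttribute step.
flowfile_attributes = {
    "absolute.path": "/users/abc/20200312/gtry/",
    "pathDate": "20200312",
    "filename": "gyyy.csv",
}

object_key = build_object_key(flowfile_attributes)
print(object_key)  # /Users/datastore/20200312/gyyy.csv
```

Because pathDate is derived from each FlowFile's own path, the key follows the date folder automatically as it changes from day to day. Keep in mind that S3 keys are plain strings, so the leading "/" becomes part of the key; drop it from the property value if you do not want that.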

 

Hope this helps you,

Matt


3 REPLIES

Contributor

@Shu_ashu: I saw that you suggested a solution for a similar pattern before. Could you please look into this and suggest an approach?

 

https://community.cloudera.com/t5/Support-Questions/NiFi-Creating-the-output-directory-from-the-cont...

Contributor

@MattWho: I would appreciate your input as well.
