Apache Nifi

New Contributor

Hello Team,

I need help getting the issue below fixed.

I have an FTP server, and on it there is a path where we keep our daily files in folders.

 

Path: /user/Mahesh/test

Folders under the path: 202100923, 20210924

 

Now I need to transfer the files and the folders and place them in an S3 bucket.

 

How do I get the folder name? Since I have so many folders, does the expression below work in an UpdateAttribute processor?

 

${absolute.path:getDelimitedField('4','/')}

 

Please advise on any other way to extract the folder name. Remember that this folder name changes daily, and there can be multiple folders at a time.

 

 

 

1 ACCEPTED SOLUTION

Master Mentor

@Cloud_era 

 

Let me see if I understand your use case fully.

You are using the ListFTP and FetchFTP processors, or the GetFTP processor, to pull in files from your FTP server.
In the ListFTP or GetFTP processor you have configured the path as "/user/Mahesh/test" and have set "Search Recursively" to true so that you pull files found in subdirectories, including "202100923" and "20210924". If you are using the GetFTP processor, you should switch to the List and Fetch processors if running on a NiFi cluster. Also keep in mind that the List/Fetch FTP processors are much newer and provide more configuration options and capabilities not found in the legacy GetFTP processor.

The GetFTP processor creates a FlowFile Attribute "absolute.path" that contains the full path to the file that is consumed.

The ListFTP processor creates a FlowFile Attribute "path" that contains the full path to the file that will be consumed by FetchFTP.

 

So in your example you end up with the above attributes set to:

/user/Mahesh/test/202100923
/user/Mahesh/test/20210924

Using the NiFi Expression Language statement "${absolute.path:getDelimitedField('4','/')}" with the above examples, what you would get back is "test", since that is the 4th delimited field.
Field 1 = blank
Field 2 = user
Field 3 = Mahesh
Field 4 = test
Field 5 = 20210924

Field 1 is blank because you set your delimiter as "/" and the string starts with a "/".
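
If it helps to see the field numbering concretely, here is a minimal Python sketch that mimics the behavior described above. The function name get_delimited_field is made up for illustration, and it is only a plain split on the delimiter, not NiFi's actual implementation (the real getDelimitedField also accepts optional quote and escape characters, which are ignored here).

```python
def get_delimited_field(value: str, index: int, delimiter: str = "/") -> str:
    """Return the 1-based field of `value` after splitting on `delimiter`.

    Hypothetical stand-in for the Expression Language function.
    Assumption of this sketch: an out-of-range index yields an empty string.
    """
    fields = value.split(delimiter)
    if 1 <= index <= len(fields):
        return fields[index - 1]
    return ""


path = "/user/Mahesh/test/20210924"

# The leading "/" produces an empty first field, so every directory name
# sits one position later than you might expect.
for i in range(1, 6):
    print(i, repr(get_delimited_field(path, i)))
```

Run against the sample path, field 4 comes back as "test" and field 5 as the dated folder, which matches the list above.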

 

So setting this to "${absolute.path:getDelimitedField('5','/')}" would, based on your examples, return either "202100923" or "20210924". The question is what happens if your absolute.path values are not always 4 directories deep. For example:
1. /user/Mahesh/test/202100923/subdir1
2. /user/Mahesh/test/20210924/subdir1
3. /user/Mahesh/test/202100923/subdir1/subdir2
4. /user/Mahesh/test/20210924/subdir1/subdir2

 

Your expression would still return just "202100923" or "20210924".
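
A quick way to convince yourself of this is to run the four deeper paths through a plain split. This is only a Python sketch of the idea, not NiFi code: field 5 in the 1-based numbering corresponds to index 4 of the split result.

```python
paths = [
    "/user/Mahesh/test/202100923/subdir1",
    "/user/Mahesh/test/20210924/subdir1",
    "/user/Mahesh/test/202100923/subdir1/subdir2",
    "/user/Mahesh/test/20210924/subdir1/subdir2",
]

# Index 4 of the split is the 5th field, because the leading "/" makes
# index 0 an empty string.
date_folders = [p.split("/")[4] for p in paths]
print(date_folders)
```

The position of the dated folder is fixed relative to the start of the path, so extra subdirectories after it do not change the result. It would change if the configured base path itself became deeper or shallower.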

I don't know how or where you are using this folder information later in your dataflow(s), so it is hard to give recommendations on what to do.

But assuming the new examples I gave above, here are some other NiFi Expression Language (NEL) options:
${absolute.path:substringAfterLast('/')} would return:
1. subdir1
2. subdir1
3. subdir2
4. subdir2

 

${absolute.path:substringAfter('/user/Mahesh/test')} would return:
1. /202100923/subdir1
2. /20210924/subdir1
3. /202100923/subdir1/subdir2
4. /20210924/subdir1/subdir2
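
For comparison, the two substring options can be approximated in Python as below. The helper names substring_after and substring_after_last are hypothetical, and returning the whole input when the marker is missing is my understanding of the documented Expression Language behavior, so treat it as an assumption of this sketch.

```python
def substring_after(value: str, marker: str) -> str:
    """Text after the first occurrence of `marker`.

    Assumption: the whole value is returned when `marker` is absent.
    """
    _, sep, tail = value.partition(marker)
    return tail if sep else value


def substring_after_last(value: str, marker: str) -> str:
    """Text after the last occurrence of `marker` (same assumption)."""
    _, sep, tail = value.rpartition(marker)
    return tail if sep else value


path = "/user/Mahesh/test/20210924/subdir1/subdir2"

print(substring_after_last(path, "/"))           # deepest directory only
print(substring_after(path, "/user/Mahesh/test"))  # everything below the base path
```

The first call keeps only the deepest directory name, while the second keeps the whole relative path below the configured base directory, which may be the more useful of the two if the directory structure needs to be preserved downstream.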

 

If you found that this response assisted with your query, please take a moment to log in and click on "Accept as Solution" below this post.

Thank you,

Matt

New Contributor

Hey Matt, 

Thanks for your advice. I followed the approach below and it worked fine for my issue.

Thanks for your help.

 

${absolute.path:getDelimitedField('5','/')}