NiFi with updateattribute

Hi all,

In my project I read log files and process them in Spark. I am using NiFi to read the files from the Tomcat log folder and copy them to the edge node of my Hadoop cluster.

The problem is that my application (whose log files I am processing) runs in a clustered environment, and the log file names are the same on all 4 Tomcat nodes. What I want is this: GetFTP fetches the log file from the app server, the data flows into an UpdateAttribute processor that appends a server and cluster identifier (something like server1Cluster1 or server2Cluster1) to the file name, and PutFile then stores the log file on the local file system under the new name, which I will process in a Spark job.

Can anyone help me configure UpdateAttribute for this case? Is there anything in UpdateAttribute that lets me identify which server a file came from, so that I can change the file name passed to PutFile accordingly?

Any help will be highly appreciated

Thanks in advance

1 ACCEPTED SOLUTION

@Biswajit Chakraborty

If you are using the GetFTP processor, it adds a getftp.remote.source attribute to each flowfile it pulls. You can use this attribute to build the filename in an UpdateAttribute processor.

Add a new property to the UpdateAttribute processor:

Property name: filename

Property value: ${filename}_${getftp.remote.source}

This appends the remote source name to the filename, separated by an underscore.

You can also write the expression in other ways. Both of the following append the remote source directly to the filename with no separator, so a filename of 711866091328995 and a remote source of HDF04-1 give 711866091328995HDF04-1:

${filename:append(${getftp.remote.source})}
(or)
${filename}${getftp.remote.source}
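
If the log files have an extension, a variant of the same idea keeps the extension at the end of the name. This is only a sketch: it assumes the incoming filename contains at least one dot, and the example filename is made up for illustration.

```
Property name:  filename
Property value: ${filename:substringBeforeLast('.')}_${getftp.remote.source}.${filename:substringAfterLast('.')}
```

With a hypothetical filename of catalina.2018-01-15.log and a remote source of HDF04-1, this should produce catalina.2018-01-15_HDF04-1.log. For filenames without a dot, stay with the simpler expressions.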

Example:

If the filename is 711866091328995 and getftp.remote.source is HDF04-1, the flowfile leaving UpdateAttribute will have the filename

711866091328995_HDF04-1

because the expression ${filename}_${getftp.remote.source} joins the remote source to the filename with an underscore.
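
To get a label such as server1Cluster1 instead of the raw host value, one option is to translate the host inside the expression with replace. A minimal sketch, assuming getftp.remote.source carries the hostname configured on GetFTP; tomcat1.example.com and tomcat2.example.com are placeholder hostnames, not values from this thread:

```
Property name:  filename
Property value: ${filename}_${getftp.remote.source:replace('tomcat1.example.com', 'server1Cluster1'):replace('tomcat2.example.com', 'server2Cluster1')}
```

Add one replace call per server. A host that matches none of them passes through unchanged, so the filename still differs per server.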

(or)

If files with the same name are still overwriting each other:

Every flowfile also has an attribute named uuid, which is a unique identifier for that flowfile. Using the UUID as the filename makes every filename unique, so nothing gets overwritten.

Config:

Property name: filename

Property value: ${uuid}
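
Note that a bare UUID no longer tells you which server the file came from. If the Spark job needs that, a possible combination (an untested sketch) keeps the original name and the source, and uses the UUID only to guarantee uniqueness:

```
Property name:  filename
Property value: ${filename}_${getftp.remote.source}_${uuid}
```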
