Move File from One local folder to another and then move to HDFS

Contributor

Hi,

I have a scenario where I need to move a file from local folder1 to local folder2, and then move that file from local folder2 to HDFS.

I proposed the following flow:

GetFile -> PutFile -> GetFile -> PutHDFS

The problem is that I am unable to connect PutFile to the second GetFile; it does not allow me to drag the relationship.

What I actually want is that once the file has been moved from local folder1 to local folder2, it is then moved to HDFS.

Rising Star

Hi,

Yes, this works as intended. GetFile is a flow-starting processor, so you cannot connect to it from other processors. Think of it as a process instance trigger.

Please use the FetchFile processor instead:

GetFile -> PutFile -> FetchFile -> PutHDFS

Hope that helps.

Contributor

@Peter Greiff

Thanks for your answer; it is working. But I have one concern: in FetchFile I have to specify the file name along with the absolute path. Let's say that in the future I want to process some other file with a different name but in the same location; I won't be able to use this data flow. Is there any way to make it generic, so that FetchFile can process any file in the given path and I don't have to change the file name explicitly in the FetchFile configuration?

Rising Star

Actually, you do not need to hard-code the values. You can pass the file name and path dynamically to the next processors.

Please check out the documentation at https://nifi.apache.org/docs/nifi-docs/html/developer-guide.html#flowfile

For example, ${filename} will return the value of the “filename” attribute.

Other values in this context are:

  • Filename ("filename"): The filename of the FlowFile. The filename should not contain any directory structure.
  • UUID ("uuid"): A universally unique identifier (UUID) assigned to this FlowFile.
  • Path ("path"): The FlowFile’s path indicates the relative directory to which a FlowFile belongs and does not contain the filename.
  • Absolute Path ("absolute.path"): The FlowFile’s absolute path indicates the absolute directory to which a FlowFile belongs and does not contain the filename.
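
To make the idea of attribute substitution concrete, here is a minimal Python sketch of how a reference such as ${filename} could be resolved against a FlowFile's attributes. It is only an illustration, not NiFi's actual Expression Language implementation: it handles plain ${name} references and nothing else. The resolve helper, the attribute values, and the /data/folder1 and /data/folder2 paths are all hypothetical.

```python
import re

# Matches simple attribute references such as ${filename} or ${absolute.path}.
# Functions and nested expressions are deliberately not supported in this sketch.
_ATTR_REF = re.compile(r"\$\{([A-Za-z0-9_.\-]+)\}")


def resolve(template, attributes):
    """Replace each ${name} in template with the matching attribute value.

    Unknown attributes become an empty string; this is a simplification
    made for the sketch.
    """
    return _ATTR_REF.sub(lambda m: attributes.get(m.group(1), ""), template)


# Hypothetical attributes of a FlowFile picked up from local folder1.
flowfile_attributes = {
    "filename": "report.csv",
    "path": "./",
    "absolute.path": "/data/folder1/",
}

# Hypothetical property value: a fixed folder2 directory plus a dynamic file name.
file_to_fetch = resolve("/data/folder2/${filename}", flowfile_attributes)
print(file_to_fetch)  # prints /data/folder2/report.csv
```

With a value of this shape in the FetchFile configuration, only the directory is fixed; the file name comes from whatever FlowFile arrives, so the same flow should handle any file placed in that location.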

New Contributor

The ExecuteStreamCommand processor can be event-driven. Use it to run a shell script that moves the file into the folder that GetFile is monitoring.
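
As a rough sketch of what such a script might do, here is a Python equivalent of the shell script (Python is used here only for illustration). The move_file function, the script name, and the example paths are hypothetical; how the source path and target folder are passed in depends on how the processor is configured.

```python
import shutil
import sys
from pathlib import Path


def move_file(source, target_dir):
    """Move one file into target_dir and return its new path."""
    source = Path(source)
    target_dir = Path(target_dir)
    # Create the target folder if it does not exist yet.
    target_dir.mkdir(parents=True, exist_ok=True)
    destination = target_dir / source.name
    shutil.move(str(source), str(destination))
    return destination


if __name__ == "__main__":
    # Hypothetical usage:
    #   python move_file.py /data/folder1/report.csv /data/folder2
    if len(sys.argv) == 3:
        print(move_file(sys.argv[1], sys.argv[2]))
```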