Copying files within HDFS using wildcard in NiFi

In my flow I need to copy HDFS files, selected by a dynamic wildcard, to another HDFS location within the same cluster.

I have Process Group Variables:

  • source_path = 'hdfs:///source/'
  • file_prefix = 'myflow_'

And a Flowfile Attribute:

  • file_timestamp = '20190520'

The source directory contains four files, and I need to copy only the two that match the pattern "${source_path}${file_prefix}${file_timestamp}.part*".
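
To make the intended selection concrete, here is a minimal Python sketch of the wildcard match. The four file names are hypothetical stand-ins (the actual directory listing is not reproduced in this post); only the three variable values come from the question.

```python
from fnmatch import fnmatchcase

# Values taken from the question.
source_path = "hdfs:///source/"
file_prefix = "myflow_"
file_timestamp = "20190520"

# Hypothetical directory listing, for illustration only.
listing = [
    "myflow_20190519.part0",
    "myflow_20190520.part0",
    "myflow_20190520.part1",
    "otherflow_20190520.part0",
]

# Same shape as "${source_path}${file_prefix}${file_timestamp}.part*".
pattern = f"{source_path}{file_prefix}{file_timestamp}.part*"

# Keep only the names whose full path matches the wildcard (case-sensitive).
to_copy = [name for name in listing if fnmatchcase(source_path + name, pattern)]
print(to_copy)
```

With this sample listing, only the two "myflow_20190520.part" files are selected.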

The MoveHDFS processor in NiFi v1.8.0 does not support Expression Language in the File Filter Regex field. How can I achieve this without resorting to ExecuteStreamCommand with "hdfs dfs -cp"?
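
For reference, this is roughly the regular expression that the wildcard corresponds to, sketched in Python. It assumes the filter is applied to the file name only, not the full path; the names `file_filter_regex` and `matches` are illustrative and not part of NiFi.

```python
import re

# Values taken from the question.
file_prefix = "myflow_"
file_timestamp = "20190520"

# Escape the literal part so "." is not treated as "any character",
# then allow any suffix after ".part" (the "*" in the wildcard).
file_filter_regex = re.escape(f"{file_prefix}{file_timestamp}.part") + ".*"


def matches(filename: str) -> bool:
    """Return True if the whole file name matches the generated regex."""
    return re.fullmatch(file_filter_regex, filename) is not None
```

The regex has to be rebuilt for every new file_timestamp value, which is why a static File Filter Regex does not work here.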

Re: Copying files within HDFS using wildcard in NiFi

@Piotr Grzegorski

Try using the ListHDFS + FetchHDFS processors.

You can simulate the MoveHDFS processor with the following flow:

ListHDFS // list all files in the HDFS directory
RouteOnAttribute // use NiFi Expression Language to keep only the required files
FetchHDFS // fetch the files from HDFS
PutHDFS // put the files into the target HDFS directory
DeleteHDFS // delete the source files that FetchHDFS pulled


Re: Copying files within HDFS using wildcard in NiFi

Thank you for your answer.

The problem is that ListHDFS is a source processor: it does not accept incoming connections, so I can't supply the changing input directory from a flowfile.

