How to determine if files exist in HDFS directory

As the last step in my process, I need to check whether any more files exist in an HDFS directory. I tried FetchHDFS, which can take an incoming flow file (unlike ListHDFS, which won't accept one), but I discovered the hard way that FetchHDFS can't take wildcards, only an HDFS path and filename. I looked for information on calling existing Java HDFS methods from ExecuteScript with Groovy, but couldn't find anything. I was hoping not to need to build a custom processor. The only option I've come up with so far is to write a small standalone Java app and call it using ExecuteStreamCommand, but that (presumably) loads a JVM every time. Any other ideas?

1 REPLY

@Jeff Watson

Could you try the GetHDFSFileInfo processor? It accepts incoming connections, and it takes regular expressions so you can match only the directories and files you need and exclude the files you don't.
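
If it helps, here is a small sketch for trying out include/exclude patterns before putting them into the processor. It is plain Python, not NiFi code. The function name `remaining_files` and the sample listing are made up for illustration. It assumes the filters are applied to the file name only and must match the whole name. Python's `re` also differs from Java regex in some corner cases, so treat it as a rough check only.

```python
import re


def remaining_files(names, file_filter=None, exclude_filter=None):
    """Return names that match the include regex and do not match the exclude regex.

    Hypothetical helper: it only imitates name-based regex filtering and
    does not talk to HDFS or NiFi.
    """
    include = re.compile(file_filter) if file_filter else None
    exclude = re.compile(exclude_filter) if exclude_filter else None
    kept = []
    for name in names:
        # Assumption: a pattern must match the entire file name.
        if include and not include.fullmatch(name):
            continue
        if exclude and exclude.fullmatch(name):
            continue
        kept.append(name)
    return kept


# Made-up directory listing, for illustration only.
listing = ["part-0001.csv", "part-0002.csv", "_SUCCESS", "part-0003.csv.tmp"]
left = remaining_files(listing, file_filter=r".*\.csv", exclude_filter=r"_.*")
print("files still present:", left)
print("directory is done:", len(left) == 0)
```

An empty result would mean nothing matching your pattern is left in the directory, which is the condition you want to check as the last step.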