Support Questions

Find answers, ask questions, and share your expertise
Celebrating as our community reaches 100,000 members! Thank you!

How to determine if files exist in HDFS directory

Rising Star

As the last step in my process, I need to check to see if any more files exist in an HDFS directory. I tried using FetchHDFS which can take an existing flow file (unlike ListHDFS which won't accept an incoming flow file), but I discovered the hard way that FetchHDFS can't take wildcards, only an HDFS path and filename. I looked for, but can't find anything on calling existing Java HDFS methods from ExecuteScript and groovy. I was hoping not to need to build a custom processor. The only option I've come up with so far is to write a small standalone Java app and call it using ExecuteStreamCommand. But that loads a JVM every time (presumably). Any other ideas?


Master Guru

@Jeff Watson

Could you try using GetHDFSFileInfo processor, as this processor accepts incoming connections and regex to match only the required directories/files/exclude files..!