I'm trying to run ListFile against a folder that contains subfolders, some of which I have permission to read and some of which I don't. I set a path filter to skip the folder I can't read, because the ListFile processor dies as soon as it hits an access-denied error instead of continuing. Even with the path filter in place, it still errors out. I even set an impossible filter on both file and path, and it still attempts to scan all folders and errors. My assumption is that ListFile scans everything in the folder you point it at, and only afterwards applies the filters to the list it has built.
How in the world do I filter out folders I don't have access to? The folder I want skipped is .Trash-12351.
I've added a path filter meant to skip all hidden folders, [^\.], and it still errors.
I've added a path filter to read ONLY the folders I want, which are named like 1243N-BLAH-125N, using ^\d+.* (tested on https://regexr.com/), and it still throws a java.nio.file.AccessDeniedException and lists the .Trash folder that I've told it to skip.
I've even added fake file and path filters with ^blahblahblah, telling it to read ONLY files and folders that start with blahblahblah, and it still throws a java.nio.file.AccessDeniedException and lists the .Trash folder I've told it to ignore.
This tells me that NiFi just scans everything, generates a list, and then runs the regex against that list before reading the details of each file, which makes it impossible to use ListFile on anything other than folders in which you have access to every single file.
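For comparison, here is a minimal Java sketch of the behavior I was expecting: a directory walk that skips non-matching folders and carries on when a folder can't be opened. This is NOT NiFi's ListFile code; the class name FilteredWalk and the method list are hypothetical, and only the java.nio.file calls are real. One detail worth noting is that SimpleFileVisitor's default visitFileFailed rethrows the exception, which would abort the whole walk the way I'm seeing.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class FilteredWalk {
    // Hypothetical helper: returns the regular files under root, descending only
    // into subdirectories whose name fully matches dirPattern.
    public static List<Path> list(Path root, Pattern dirPattern) throws IOException {
        List<Path> found = new ArrayList<>();
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs) {
                if (dir.equals(root)) {
                    return FileVisitResult.CONTINUE; // always enter the starting folder
                }
                String name = dir.getFileName().toString();
                // Skip the contents of any directory that does not match the filter.
                return dirPattern.matcher(name).matches()
                        ? FileVisitResult.CONTINUE
                        : FileVisitResult.SKIP_SUBTREE;
            }

            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                if (attrs.isRegularFile()) {
                    found.add(file);
                }
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFileFailed(Path file, IOException exc) {
                // A directory that cannot be opened (e.g. AccessDeniedException)
                // is reported here; returning CONTINUE skips it instead of aborting.
                return FileVisitResult.CONTINUE;
            }
        });
        return found;
    }
}
```

Calling FilteredWalk.list(root, Pattern.compile("^\\d+.*")) on a folder holding .Trash-12351 and 1243N-BLAH-125N would return only the files under 1243N-BLAH-125N.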
File filter: [^\.].*
Path filters tried to skip the .Trash folder:
^fawkIt <-- I was checking whether it even looked at the path filter, and it was still attempting to list with this impossible requirement
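To rule out the regexes themselves, here is a small Java sketch that checks each filter above against the two folder names. It shows both full-match (Matcher.matches) and partial-match (Matcher.find) semantics, because I don't know for certain which one ListFile applies to the path filter; the class name FilterCheck and its methods are hypothetical.

```java
import java.util.regex.Pattern;

public class FilterCheck {
    // True only if the whole string matches the regex.
    public static boolean full(String regex, String s) {
        return Pattern.compile(regex).matcher(s).matches();
    }

    // True if any part of the string matches the regex.
    public static boolean partial(String regex, String s) {
        return Pattern.compile(regex).matcher(s).find();
    }

    public static void main(String[] args) {
        String[] regexes = {"[^\\.]", "[^\\.].*", "^\\d+.*", "^fawkIt"};
        String[] names = {".Trash-12351", "1243N-BLAH-125N"};
        for (String regex : regexes) {
            for (String name : names) {
                System.out.println(regex + " vs " + name
                        + " full=" + full(regex, name)
                        + " partial=" + partial(regex, name));
            }
        }
    }
}
```

Under either semantics, ^\d+.* and ^fawkIt reject .Trash-12351. The bare [^\.] is different: as a full match it rejects any name longer than one character, and as a partial match it accepts .Trash-12351 because the name contains a non-dot character, so on its own it does not skip hidden folders.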