NiFi: multiple directories for ListHDFS

Rising Star

hi all,

I see that the Directory property of the ListHDFS processor supports Expression Language.

Do you know how to set multiple directories for a single ListHDFS processor? Something like:

Directory: /tmp/{toto|truc}

Thanks

11 REPLIES

Master Guru

If you set it to /tmp and set Recurse Sub Directories to true, then it will list both.

Rising Star

@Bryan: I know about the Recurse Sub Directories option, but my dataflow should delete files only in /tmp/toto and /tmp/truc, not in /tmp/Keep and /tmp/News, so I can't just set /tmp.

Master Mentor

@mayki wogno

Setting /tmp will cause ListHDFS to produce a listing of files in all 4 of your directories. Following that listing, use a RouteOnAttribute processor to auto-terminate any listings that did not come from /tmp/toto or /tmp/truc before feeding the remaining FlowFiles down the rest of your dataflow.
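
The routing decision described above can be sketched in plain Java as a regular-expression check on each listed file's directory. This is only an illustration under assumptions: the class name, method name, and the regex are hypothetical, and it assumes each listing carries the absolute directory of the file as a string.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of NiFi: mirrors the keep/drop decision in plain Java.
class PathRouteSketch {
    // Assumed regex: /tmp/toto or /tmp/truc, optionally followed by sub-directories.
    static final Pattern KEEP = Pattern.compile("/tmp/(toto|truc)(/.*)?");

    // True when the whole directory path matches the pattern (full match, not a partial find).
    static boolean keep(String directory) {
        return KEEP.matcher(directory).matches();
    }

    public static void main(String[] args) {
        String[] directories = {"/tmp/toto", "/tmp/truc/sub", "/tmp/Keep", "/tmp/News"};
        for (String directory : directories) {
            System.out.println(directory + " -> " + (keep(directory) ? "keep" : "drop"));
        }
    }
}
```

In RouteOnAttribute, a comparable check could be a property whose value is ${path:matches('/tmp/(toto|truc)(/.*)?')}, assuming ListHDFS writes the file's directory into the path attribute. Anything that does not match would go to the unmatched relationship, which can be auto-terminated.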

Rising Star

@Matt: thanks.

Rising Star

Hi again,

In my directory /user/prod/ I have thousands of files.

How can I configure ListHDFS to list only some directories, such as /user/prod/201703*?

That is, list only the directories for March 2017?

thanks

Rising Star

What's the best way to remove old files from a list of directories?

Do I need to create one ListHDFS processor per sub-path?

/horton/catalogue_od/lighter/YYYYMMDD
/horton/google/catalogue/YYYYMMDD
/horton/google/channel/lighter/YYYYMMDD
/horton/google/lighter/YYYYMMDD
/horton/mad/lighter/YYYYMMDD
/horton/optinoptout/YYYYMMDD
/horton/mdr/YYYY/WW
/horton/macbymac/YYYYMMDD
/horton/purchase/YYYY/WW
/horton/paris/filtered/YYYY/WW
/horton/paris/finaltastebox/YYYY/WW
/horton/paris/genetic/YYYY/WW
/horton/paris/similarpattern/YYYY/WW
/horton/paris/substithortonpattern/YYYY/WW
/horton/paris/tastebox/YYYY/WW
/horton/paris/uniqtastebox/YYYY/WW
/horton/scoring/exlibris/input/bestchannels/YYYYMMDD
/horton/scoring/exlibris/input/full/YYYYMMDD
/horton/scoring/exlibris/output/YYYYMMDD
/horton/scoring/input/YYYY/WW
/horton/scoring/output/YYYYMMDD
/horton/statistic/customer/YYYY/WW
/horton/stb/lighter/YYYYMMDD
/horton/tvep/YYYYMMDD
/horton/vod/YYYYMMDD

/shared/generated/adobe/YYYYMMDD
/shared/generated/comscore/YYYYMMDD
/shared/generated/google/YYYYMMDD
/shared/generated/amazon/logs/YYYYMMDD
/shared/generated/amazon/tvep/YYYYMMDD
/shared/generated/amazonstats/YYYYMMDD
/shared/generated/star/YYYYMMDD

/celluleScores/YYYY/WW
/jedi/optinoptout/YYYYMMDD
/jedi/profil/YYYYMMDD


Master Guru

In the latest code in master, there is an improvement to ListHDFS to add a new property:

public static final PropertyDescriptor FILE_FILTER = new PropertyDescriptor.Builder()
    .name("File Filter")
    .description("Only files whose names match the given regular expression will be picked up")
    .required(true)
    .defaultValue("[^\\.].*")
    .addValidator(StandardValidators.REGULAR_EXPRESSION_VALIDATOR).build();
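
To illustrate how a filter like this behaves, here is a minimal sketch in plain Java. It assumes the regular expression must match the whole file name (not the directory path); the class and method names are hypothetical and not part of NiFi.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not NiFi code.
class FileFilterSketch {
    // Same regex as the default value above: any name that does not start with a dot.
    static final Pattern DEFAULT_FILTER = Pattern.compile("[^\\.].*");

    // Assumption: the whole file name has to match the filter.
    static boolean accepts(Pattern filter, String fileName) {
        return filter.matcher(fileName).matches();
    }

    public static void main(String[] args) {
        String[] names = {"part-00000", "data.csv", ".hidden", "_SUCCESS"};
        for (String name : names) {
            System.out.println(name + " -> " + accepts(DEFAULT_FILTER, name));
        }
    }
}
```

With the default value, hidden files such as .hidden are skipped and everything else is listed. A stricter regular expression could narrow the listing by file name.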




Does this help?

Rising Star

@Bryan, will it come with the next version of NiFi?

Master Guru

Yes, it's in the latest code that hasn't been released yet, so it would be in the next version, which will likely be Apache NiFi 1.2.