NIFI : MultiDirectories for ListHDFS

Explorer

Hi all,

I see that the ListHDFS processor's Directory property supports Expression Language.

Do you know how to set multiple directories for a single ListHDFS processor? For example:

Directory : /tmp/{toto|truc} 

Thanks

11 REPLIES

Re: NIFI : MultiDirectories for ListHDFS

If you set it to /tmp and set Recurse Sub Directories to true, then it will list both.

Re: NIFI : MultiDirectories for ListHDFS

Explorer

@Bryan: I know about the Recurse Sub Directories setting, but my dataflow must delete files only in /tmp/toto and /tmp/truc, not in /tmp/Keep or /tmp/News, so I can't simply set the directory to /tmp.

Re: NIFI : MultiDirectories for ListHDFS

Master Guru

@mayki wogno

Setting the directory to /tmp will cause ListHDFS to produce a listing of files in all 4 of your directories. After that listing, use a RouteOnAttribute processor to auto-terminate any listings that did not come from /tmp/toto or /tmp/truc, then feed the remaining FlowFiles down the rest of your dataflow.
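
As a rough illustration of the routing condition (an assumption, not something confirmed in this thread): if the `path` attribute on each listed FlowFile holds the absolute HDFS directory of the file, a RouteOnAttribute property along the lines of `${path:matches('/tmp/(toto|truc)(/.*)?')}` would keep only the wanted listings. The Java sketch below only checks what that regular expression accepts; the class name `PathRouteSketch` and the method `keep` are hypothetical and are not part of NiFi.

```java
import java.util.regex.Pattern;

public class PathRouteSketch {
    // Hypothetical pattern: accept /tmp/toto or /tmp/truc, optionally followed by subdirectories.
    private static final Pattern KEEP = Pattern.compile("/tmp/(toto|truc)(/.*)?");

    // Returns true when the whole path matches the pattern (full match, not a partial find).
    public static boolean keep(String path) {
        return KEEP.matcher(path).matches();
    }

    public static void main(String[] args) {
        String[] paths = {"/tmp/toto", "/tmp/truc/sub", "/tmp/Keep", "/tmp/News"};
        for (String p : paths) {
            System.out.println(p + " -> " + keep(p));
        }
    }
}
```

FlowFiles that do not satisfy the condition would go to the unmatched relationship, which can be auto-terminated.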

Re: NIFI : MultiDirectories for ListHDFS

Explorer

@Matt: thanks.

Re: NIFI : MultiDirectories for ListHDFS

Explorer

Hi again,

In my directory /user/prod/ I have thousands of files.

How can I configure ListHDFS to list only some directories, such as /user/prod/201703*?

That is, list only the directories for March 2017?

thanks

Re: NIFI : MultiDirectories for ListHDFS

Explorer

What's the best way to remove old files from the directories listed below?

Do I need to create one ListHDFS processor per subpath?

/horton/catalogue_od/lighter/YYYYMMDD
/horton/google/catalogue/YYYYMMDD
/horton/google/channel/lighter/YYYYMMDD
/horton/google/lighter/YYYYMMDD
/horton/mad/lighter/YYYYMMDD
/horton/optinoptout/YYYYMMDD
/horton/mdr/YYYY/WW
/horton/macbymac/YYYYMMDD
/horton/purchase/YYYY/WW
/horton/paris/filtered/YYYY/WW
/horton/paris/finaltastebox/YYYY/WW
/horton/paris/genetic/YYYY/WW
/horton/paris/similarpattern/YYYY/WW
/horton/paris/substithortonpattern/YYYY/WW
/horton/paris/tastebox/YYYY/WW
/horton/paris/uniqtastebox/YYYY/WW
/horton/scoring/exlibris/input/bestchannels/YYYYMMDD
/horton/scoring/exlibris/input/full/YYYYMMDD
/horton/scoring/exlibris/output/YYYYMMDD
/horton/scoring/input/YYYY/WW
/horton/scoring/output/YYYYMMDD
/horton/statistic/customer/YYYY/WW
/horton/stb/lighter/YYYYMMDD
/horton/tvep/YYYYMMDD
/horton/vod/YYYYMMDD

/shared/generated/adobe/YYYYMMDD
/shared/generated/comscore/YYYYMMDD
/shared/generated/google/YYYYMMDD
/shared/generated/amazon/logs/YYYYMMDD
/shared/generated/amazon/tvep/YYYYMMDD
/shared/generated/amazonstats/YYYYMMDD
/shared/generated/star/YYYYMMDD

/celluleScores/YYYY/WW
/jedi/optinoptout/YYYYMMDD
/jedi/profil/YYYYMMDD


Re: NIFI : MultiDirectories for ListHDFS

In the latest code in master, there is an improvement to ListHDFS to add a new property:

public static final PropertyDescriptor FILE_FILTER = new PropertyDescriptor.Builder()
    .name("File Filter")
    .description("Only files whose names match the given regular expression will be picked up")
    .required(true)
    .defaultValue("[^\\.].*")
    .addValidator(StandardValidators.REGULAR_EXPRESSION_VALIDATOR).build();
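
To show what such a filter would accept, here is a minimal sketch that assumes the regular expression is applied as a full match against each file name (the property description suggests this, but the matching code is not shown here). The class `FileFilterSketch`, the method `filter`, and the sample file names are hypothetical. Note that the description refers to file names, so a pattern like `201703.*` would select files whose names start with 201703, not directories.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class FileFilterSketch {
    // Same regular expression as the default value above: names that do not start with a dot.
    public static final Pattern DEFAULT_FILTER = Pattern.compile("[^\\.].*");

    // Keeps only the names that fully match the given pattern.
    public static List<String> filter(List<String> names, Pattern pattern) {
        return names.stream()
                .filter(n -> pattern.matcher(n).matches())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Made-up file names for illustration only.
        List<String> names = Arrays.asList(".hidden", "part-0001", "201703_report.csv");
        System.out.println(filter(names, DEFAULT_FILTER));
        System.out.println(filter(names, Pattern.compile("201703.*")));
    }
}
```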




Does this help?

Re: NIFI : MultiDirectories for ListHDFS

Explorer

@Bryan, will this come with the next version of NiFi?

Re: NIFI : MultiDirectories for ListHDFS

Yes, it's in the latest code, which hasn't been released yet, so it would be in the next version, which will likely be Apache NiFi 1.2.
