Can the GetFile processor read files from two different directories?

New Contributor

Hi Team,

I want to read files from two different folders and am using the GetFile processor to read them. In the properties section I gave one folder path, and the processor read the files from that folder as expected. When I tried to add the other folder to the Input Directory property, the processor was marked invalid. Can you please suggest how to give two folder paths in the Input Directory property?

2 folder paths:

/company_shared/nas_data/home/walmart/inbox/

/company_shared/nas_data/home/target/inbox/

I used a pipe to concatenate the two folder paths.

GetFile processor, Input Directory property: /company_shared/nas_data/home/ryemireddy/inbox/|/company_shared/nas_data/home/krishna/inbox/

ERROR: invalid, because directory does not exist.

Please suggest the correct value for the Input Directory property.

Thanks,

Rangareddy Y

Accepted Solution

New Contributor

@Rangareddy Y

As far as I know, GetFile accepts only a single directory. You can use the ListFile processor with recursion enabled instead. To restrict which directories are scanned, use the Path Filter property; to restrict which files are picked up, use the File Filter property.

Cloudera Employee

@Rangareddy Y The Input Directory property of GetFile accepts only one directory, so you cannot pass multiple directories to GetFile.
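To illustrate why the pipe-joined value is rejected: the error message suggests the whole property value, pipe included, is checked as one literal directory path. The Python sketch below mimics that kind of check. It is not NiFi's validator; the function name and the temporary folders are stand-ins made up for the example.

```python
import os
import tempfile


def directory_exists(value: str) -> bool:
    """Treat the entire property value as one literal path (assumed behaviour)."""
    return os.path.isdir(value)


with tempfile.TemporaryDirectory() as root:
    # Two real folders, standing in for the two inbox directories.
    first = os.path.join(root, "walmart", "inbox")
    second = os.path.join(root, "target", "inbox")
    os.makedirs(first)
    os.makedirs(second)

    # The pipe does not split the value; it just becomes part of one path name.
    joined = first + "|" + second

    results = {
        "first": directory_exists(first),
        "second": directory_exists(second),
        "joined": directory_exists(joined),
    }

print(results)
```

Each folder exists on its own, but the joined string names no directory, which matches the "directory does not exist" message.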

New Contributor

@ashok.kumar, can you suggest how to write a path filter for two folders? I tried the path filter; with one path it is accepted, but how do I add the second one?
The properties are set as follows:

Input Directory: /company_shared/nas_data/home/
Path Filter: company_shared/nas_data/home/walmart/inbox/

It is reading the files of every folder under the home path. There are other folders under home that the ListFile processor should not read.

How do I add the other path to the path filter? Please advise.

Thanks,

Rangareddy Y

New Contributor

@Rangareddy Y

Say you have two directories:

/company_shared/nas_data/home

and

/company_shared/sas_data/home

Set Input Directory to: /company_shared/

Set Path Filter to: nas_data/home|sas_data/home

The processor walks recursively through all folders under the input directory, tries to match each path against the Path Filter regex, and picks up files accordingly.
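A small sketch of how such an alternation behaves may help when building the filter. It uses Python's `re` module rather than NiFi itself (the patterns here behave the same in Java regex), and it assumes the filter is compared against each directory's path relative to Input Directory and must match the whole path. Both points are assumptions worth verifying against the ListFile documentation for your NiFi version. The candidate folder names are invented.

```python
import re

# Path Filter value from the reply above.
PATH_FILTER = re.compile(r"nas_data/home|sas_data/home")


def directory_selected(relative_dir: str) -> bool:
    # Assumption: the filter sees the directory path relative to
    # Input Directory and has to match it in full.
    return PATH_FILTER.fullmatch(relative_dir) is not None


# Hypothetical directories found under /company_shared/
candidates = [
    "nas_data/home",
    "sas_data/home",
    "nas_data/archive",
    "nas_data/home/walmart/inbox",
]

selected = [d for d in candidates if directory_selected(d)]
print(selected)
```

Under the same assumption, folders below the two home directories would need a pattern that also covers the rest of the path, for example `(nas_data|sas_data)/home(/.*)?`.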

New Contributor

@ashok.kumar, it's working fine. The processor lists only the files in the folders I specified in the path filter. Thanks.

However, if any files exist directly in the input directory /company_shared/, the processor lists those files as well. Is there any way to exclude them?

In my case we do not keep any files directly in /company_shared/, so it is not an issue for me.


Thanks,

Rangareddy Y

New Contributor

@Rangareddy Y

You can use the File Filter property for that.
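As a rough sketch of what a file filter could look like, the Python snippet below tests a name-based regex. The pattern and file names are hypothetical, and it assumes the filter sees only the bare file name and must match all of it. Since a name-based filter cannot tell which folder a file sits in, this only helps if the unwanted files in /company_shared/ are named differently from the ones you want.

```python
import re

# Hypothetical File Filter: names that do not start with a dot and end in .csv
FILE_FILTER = re.compile(r"[^.].*\.csv")


def file_selected(filename: str) -> bool:
    # Assumption: the filter is applied to the file name only (no directory
    # part) and has to match the whole name.
    return FILE_FILTER.fullmatch(filename) is not None


# Invented file names for illustration.
names = ["orders.csv", "readme.txt", ".hidden.csv", "orders.csv.bak"]

kept = [n for n in names if file_selected(n)]
print(kept)
```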