Is it possible to use upstream attributes in GetHDFS?

New Contributor

I'm working on a NiFi flow that will retrieve files from HDFS. I'm using the GetHDFS processor to pull all the files from a specific directory. Ideally, I'd like to use an attribute from an XML properties file as the HDFS directory, in case we need to change directories. This would allow us to do that without having to change the NiFi flow.

The problem I am having is how to get the GetHDFS processor to recognize the attributes created by my upstream EvaluateXPath processor, since GetHDFS does not accept upstream connections.

I'm still relatively new to NiFi, so I'm completely stumped as to how, if at all, I can get this to work. Any ideas?

Accepted Solutions

Re: Is it possible to use upstream attributes in GetHDFS?

New Contributor

To resolve my issue, I had to write a bash script that retrieves a listing of the file names from my HDFS folder. That gives me the list of ALL files currently in the directory. I am then able to use the FetchHDFS processor to retrieve each of the files by file name.
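
The bash script itself was not posted. As a rough illustration of the same idea, here is a minimal Python sketch (not the original script) that shells out to `hdfs dfs -ls` and keeps only the file paths. The function names `parse_hdfs_ls` and `list_hdfs_files` are made up for this example, and it assumes the usual eight-column `-ls` output with the `hdfs` client on the PATH.

```python
import subprocess


def parse_hdfs_ls(output):
    """Return the file paths found in the text printed by `hdfs dfs -ls`.

    Assumes the usual eight-column layout:
    permissions, replication, owner, group, size, date, time, path.
    """
    paths = []
    for line in output.splitlines():
        # Split into at most 8 fields so a path containing spaces stays whole.
        fields = line.split(None, 7)
        if len(fields) < 8:
            continue  # e.g. the "Found N items" header line
        if fields[0].startswith("d"):
            continue  # skip subdirectories, keep regular files only
        paths.append(fields[7])
    return paths


def list_hdfs_files(directory):
    """List the files in an HDFS directory (requires the hdfs CLI)."""
    result = subprocess.run(
        ["hdfs", "dfs", "-ls", directory],
        capture_output=True,
        text=True,
        check=True,  # raise if the hdfs command fails
    )
    return parse_hdfs_ls(result.stdout)
```

Calling `list_hdfs_files("/some/hdfs/dir")` would return one full path per file. How each path then reaches FetchHDFS is not described in the post, so that part is left out here.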

Replies

Re: Is it possible to use upstream attributes in GetHDFS?

@Mike Bailey

Have you tried using ListHDFS/FetchHDFS to implement your logic?

EvaluateXPath parses the XML file and gives you the name of the directory. You use this to list all the files in that directory with ListHDFS, and then FetchHDFS to get the data. The state you were referring to is there to avoid getting the same file several times, so that only new files are picked up from the directory. I expect this to be the desired behavior.
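
To make the first step concrete, here is a small Python sketch of what "parse the XML file and get the directory name" amounts to. It only illustrates the XPath lookup and is not NiFi configuration. The XML layout, the element names, and the `extract_directory` helper are all hypothetical, since the actual properties file was not shared. Note that Python's `ElementTree` supports only a subset of XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical properties file; the real layout was not posted.
SAMPLE_PROPERTIES = """<properties>
  <hdfs>
    <directory>/data/incoming</directory>
  </hdfs>
</properties>"""


def extract_directory(xml_text, xpath="./hdfs/directory"):
    """Return the text of the element selected by xpath.

    The path is evaluated relative to the document's root element.
    """
    root = ET.fromstring(xml_text)
    node = root.find(xpath)
    if node is None or node.text is None:
        raise ValueError("no value found for " + xpath)
    return node.text.strip()


hdfs_directory = extract_directory(SAMPLE_PROPERTIES)
```

In the flow, the value obtained this way is what would drive the listing step.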

Re: Is it possible to use upstream attributes in GetHDFS?

New Contributor

Hi Abdelkrim,

Can you explain how EvaluateXPath will pass the directory attribute to ListHDFS? What is the best practice when the directory for ListHDFS has to be provided dynamically?

Thanks,

Algis