
NiFi InputRequirement.FORBIDDEN purpose




Some processors are annotated as source processors, i.e. they do not accept incoming flow files. I don't understand the reason for this restriction. What is the idea behind this constraint?

I need to look up an HDFS folder path from some metadata/config, then fetch all the files in that folder and do some further processing on them before writing them out to another sink.

The problem is that once I get the path from the metadata, I cannot supply it to a ListHDFS processor, because ListHDFS does not accept an incoming flow file. I don't understand why it was designed this way. @Bryan Bende?

I have run into several such restrictions when designing workflows because of InputRequirement.FORBIDDEN, and what's frustrating is that the processor documentation does not say anywhere that such a processor can only be the first step in a flow.

Can someone explain the reason for the restriction and the way around it? Do I need to build a custom version of ListHDFS that takes the path from an incoming flow file? I tried WebHDFS and HttpFS, but I cannot use them in production because my production cluster is kerberized and the InvokeHTTP processor does not seem to support SPNEGO.


Re: NiFi InputRequirement.FORBIDDEN purpose

It is up to the developer of a processor to determine if their processor is going to support incoming flow files.

In the case of ListHDFS, the intended use case was to monitor a directory for new files, using the timestamp of the last files it saw in the previous execution, so it expects to list the same directory each time.

The InputRequirement annotation is a way to indicate to the framework whether incoming connections should be allowed. A while ago this annotation did not exist, and you could connect a queue to any processor, but the processor might have never used the flow files from the queue, and it was very confusing.
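To make the idea concrete, here is a minimal, self-contained sketch of how a framework could read such an annotation and refuse an incoming connection. Everything in it is a stand-in written for illustration: `InputRequirementSketch`, `Requirement`, the three example classes, and `mayConnectInto` are hypothetical and are not NiFi's actual classes or validation code. The three enum values mirror the options NiFi's annotation offers (required, allowed, forbidden); treating an unannotated class as "allowed" is an assumption of this sketch.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Stand-in annotation for illustration only; NOT the real NiFi annotation.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface InputRequirementSketch {
    Requirement value();
}

// The three choices a processor author can declare.
enum Requirement { INPUT_REQUIRED, INPUT_ALLOWED, INPUT_FORBIDDEN }

// A source-style component: it generates its own work and ignores any queue.
@InputRequirementSketch(Requirement.INPUT_FORBIDDEN)
class ListingSource {}

// A component that only makes sense when it is handed a flow file.
@InputRequirementSketch(Requirement.INPUT_REQUIRED)
class FetchStep {}

// No annotation: this sketch assumes input is allowed in that case.
class UnannotatedStep {}

class InputRequirementDemo {
    // Read the declared requirement via reflection, defaulting to INPUT_ALLOWED.
    static Requirement requirementOf(Class<?> processorClass) {
        InputRequirementSketch a =
                processorClass.getAnnotation(InputRequirementSketch.class);
        return a == null ? Requirement.INPUT_ALLOWED : a.value();
    }

    // A framework-side check: may a queue be connected into this component?
    static boolean mayConnectInto(Class<?> processorClass) {
        return requirementOf(processorClass) != Requirement.INPUT_FORBIDDEN;
    }

    public static void main(String[] args) {
        System.out.println("ListingSource accepts input: "
                + mayConnectInto(ListingSource.class));
        System.out.println("FetchStep accepts input: "
                + mayConnectInto(FetchStep.class));
        System.out.println("UnannotatedStep accepts input: "
                + mayConnectInto(UnannotatedStep.class));
    }
}
```

The point of declaring this up front is that the mistake is caught when the flow is drawn, instead of flow files silently piling up in a queue that the processor never reads.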

So I don't think the issue is the InputRequirement annotation itself, but rather that you are looking for a processor similar to ListHDFS with different behavior: one that can list an arbitrary directory based on an incoming flow file, and most likely does not maintain state to find newer files, since you are probably only listing the directory once. You could implement a custom processor for this, or, if you have the Hadoop client on the machine where NiFi runs, you could just use ExecuteStreamCommand and make a call out to "hadoop fs -ls ${your.dir}" or something like that.
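If you go the "hadoop fs -ls" route, the command returns a text listing that still has to be turned into individual file paths. The following is a rough sketch of that parsing step in plain Java, which could be adapted into a script or a custom processor. It assumes the usual eight-column listing layout (permissions, replication, owner, group, size, date, time, path) preceded by a "Found N items" line; the class name `HdfsLsParser` and the sample paths are made up for illustration, so check the output of your own Hadoop version before relying on it.

```java
import java.util.ArrayList;
import java.util.List;

class HdfsLsParser {
    // Illustrative listing; the paths, owner, and group are invented.
    static final String SAMPLE =
            "Found 3 items\n"
          + "-rw-r--r--   3 etl hadoop       1024 2017-03-01 10:15 /data/in/a.csv\n"
          + "drwxr-xr-x   - etl hadoop          0 2017-03-01 10:16 /data/in/archive\n"
          + "-rw-r--r--   3 etl hadoop       2048 2017-03-01 10:17 /data/in/b with space.csv\n";

    // Return the paths of regular files (not directories) in the listing.
    static List<String> filePaths(String lsOutput) {
        List<String> paths = new ArrayList<>();
        for (String raw : lsOutput.split("\\R")) {
            String line = raw.trim();
            // Skip blank lines and the "Found N items" summary line.
            if (line.isEmpty() || line.startsWith("Found ")) {
                continue;
            }
            // Limit the split to 8 fields so a path containing spaces stays whole.
            String[] fields = line.split("\\s+", 8);
            if (fields.length < 8) {
                continue; // not a listing row in the assumed layout
            }
            // A leading 'd' in the permissions column marks a directory.
            if (fields[0].startsWith("d")) {
                continue;
            }
            paths.add(fields[7]);
        }
        return paths;
    }

    public static void main(String[] args) {
        for (String path : filePaths(SAMPLE)) {
            System.out.println(path);
        }
    }
}
```

Each resulting path could then be placed in a flow file attribute and handed to a fetch-style processor that does accept incoming flow files.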

Also, the documentation does show the "Input Requirement":

Input requirement: 

This component does not allow an incoming relationship.