
Capture error message in NiFi PutHDFS

Expert Contributor

Is there any way for a NiFi PutHDFS processor to capture the error message that caused a failure and store it in an attribute? For example, a failure due to HDFS misconfiguration must be handled differently from a duplicate file. Is there any way to capture this error message in the FlowFile's attributes?

1 ACCEPTED SOLUTION

Master Mentor
@Mark Heydenrych

While I like the idea, there is currently no way to have a log message written to a FlowFile's attributes upon routing to a failure relationship. You may want to open an Apache NiFi Jira for this idea.

Typically the "failure" relationship is routed back to the source processor so that multiple attempts can be made to deliver the file. In cases like network hiccups, duplicate files, etc. this makes a lot of sense. With processor configuration failures, permissions issues, etc., however, the file will never succeed no matter how many times it is retried.

You could set up a failure count loop. This loop would create a counter attribute on FlowFiles that are routed to "failure" and continue to loop them back to PutHDFS until the count reaches a configured number. Once that count is reached, the FlowFiles would be routed out of the loop. You could then send a notification about the failed FlowFile via PutEmail for user investigation.
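Such a loop is typically built from an UpdateAttribute processor on the failure path and a RouteOnAttribute processor that decides when to exit. A minimal sketch of the two property settings follows; the property names ("retry.count", "exceeded") and the retry limit of 3 are illustrative choices, not prescribed values:

```
# UpdateAttribute (on the failure path): increment a counter attribute,
# treating a missing attribute as 0 on the first failure
retry.count  =>  ${retry.count:replaceNull(0):plus(1)}

# RouteOnAttribute: matching FlowFiles leave the loop after 3 attempts
exceeded     =>  ${retry.count:ge(3)}
```

FlowFiles matching the "exceeded" rule would then be routed to PutEmail (or any other notification/error-handling branch), while the unmatched relationship loops back to PutHDFS.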

Here is a link to a retry count loop flow NiFi template:

https://cwiki.apache.org/confluence/download/attachments/57904847/Retry_Count_Loop.xml?version=1&mod...

Thanks,

Matt


4 REPLIES


Master Guru

If there are specific error scenarios we want to handle differently, we may want to add additional failure relationships, such as "failure_duplicate". That way the processor itself would detect the condition and route the FlowFile to the appropriate relationship.

Expert Contributor

I was initially hoping to find that kind of relationship, and failing that, hoping the cause of failure would be stored in an attribute. As @Matt Clarke suggested, I think I will open a Jira for this.

Expert Contributor

@Matt Clarke Thanks for this info. We do currently have a failure count loop as you suggested, which will eventually dump files in an error bucket for reprocessing later. I was just hoping to be able to identify duplicates directly from the attributes themselves. I think I will open a Jira for this.