Created 03-06-2017 11:24 AM
Is there any way in a NiFi PutHDFS processor to capture the error message that causes a failure and store it in an attribute? For example, a Put that fails because HDFS is misconfigured must be handled differently from one that fails because of a duplicate file. Is there any way to capture this error message in a FlowFile's attributes?
Created 03-06-2017 03:28 PM
While I like the idea, there is currently no way to have a log message written to a FlowFile's attribute upon routing to a failure relationship. You may want to open an Apache NiFi Jira around this idea.
Typically the "failure" relationship is routed back to the source processor so that multiple attempts can be made to deliver the file. In cases like network hiccups, duplicate files, etc. this makes a lot of sense. When dealing with processor config failures, permissions issues, etc. the file will never be delivered successfully, no matter how many times it is retried.
You could set up a failure count loop. This loop would create a counter attribute on FlowFiles that are routed to "failure" and continue to loop them back to PutHDFS until the count has reached a configured number. Once that count is reached, the FlowFiles would be routed out of the loop. You could then send a notification of the failed FlowFile via PutEmail for user investigation.
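To make the loop logic concrete, here is a rough Python sketch of the decision it makes each time a FlowFile comes back from "failure". This is not NiFi code. The attribute name `retry.count`, the threshold of 3, and the relationship names `retry` and `give_up` are all assumptions chosen for illustration; in a real flow this would typically be built from standard processors rather than written by hand.

```python
MAX_RETRIES = 3  # assumed threshold; use whatever limit suits your flow


def route_failure(attributes, max_retries=MAX_RETRIES):
    """Increment the retry counter and decide where the FlowFile goes next.

    `attributes` stands in for a FlowFile's attribute map (string to string).
    Returns the updated attributes and a hypothetical relationship name.
    """
    # A FlowFile that has never failed has no counter yet, so start from 0.
    count = int(attributes.get("retry.count", "0")) + 1
    updated = {**attributes, "retry.count": str(count)}

    # Below the limit: loop back to PutHDFS. At the limit: leave the loop.
    relationship = "retry" if count < max_retries else "give_up"
    return updated, relationship
```

With these assumed values, a FlowFile is looped back after its first and second failures and routed out of the loop (for example, to the email notification) after its third.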
Here is a link to a retry count loop flow NiFi template:
Thanks,
Matt
Created 03-06-2017 04:42 PM
If there are specific error scenarios that we want to handle differently, we may want to have additional failure relationships, like "failure_duplicate". This way the processor itself would detect this and route the flow file to the appropriate relationship.
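As a purely illustrative sketch of that idea (written in Python, not against the NiFi processor API), the routing decision could look something like the following. The relationship name "failure_permission" and the mapping from Python's built-in exception types are assumptions made up for the example; the actual PutHDFS processor does not behave this way today.

```python
def classify_failure(error):
    """Map an error to a hypothetical failure relationship name."""
    # A target file that already exists is a duplicate, not a transient fault.
    if isinstance(error, FileExistsError):
        return "failure_duplicate"
    # Permission problems will not fix themselves on retry either.
    if isinstance(error, PermissionError):
        return "failure_permission"
    # Anything else falls back to the generic relationship.
    return "failure"
```

Downstream, each relationship could then be wired to its own handling, so duplicates never enter the retry loop at all.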
Created 03-07-2017 05:20 AM
I was hoping to see that kind of relationship initially, and thereafter hoping that the cause of failure would be stored in an attribute. As @Matt Clarke suggested, I think I will open a Jira for this.
Created 03-07-2017 05:37 AM
@Matt Clarke Thanks for this info. We do currently have a failure count loop as you suggested, which will eventually dump files in an error bucket for reprocessing later. I was just hoping to be able to identify duplicates directly from the attributes themselves. I think I will open a Jira for this.