I'm a little confused by the MonitorActivity processor in Hortonworks/NiFi data flows. The idea behind the processor makes perfect sense, but for a complex data flow that spans many thousands of files it seems to be missing an option to delay, or to trigger, the point at which it actually starts monitoring. If I bring down my workflow and restart it, I would like to be able to delay when the monitor begins to watch for inactivity. If 15-20 minutes of other processing has to happen before data even reaches the group that needs monitoring, then I am guaranteed to get at least one inactivity trigger. That is logical, but it is also a little limiting. If I could start the monitor on the first inbound flowfile of the monitored processor, everything would behave the way I expect: InvokeHttp (the first flowfile triggers the monitor) -> MonitorActivity (starts monitoring on the first flowfile from InvokeHttp) -> the next processor is triggered once InvokeHttp has been inactive for 2 minutes. Any thoughts or ideas for getting around this limitation?
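To make the behavior I'm asking for concrete, here is a minimal Python sketch of the semantics only. It is not NiFi code and does not use any NiFi API; the class name `ArmedInactivityMonitor` and its methods are hypothetical, and timestamps are passed in explicitly (in seconds) so the logic is easy to follow. The assumption is that the monitor stays dormant until it sees its first flowfile, and only then starts counting toward the 2-minute threshold.

```python
from typing import Optional


class ArmedInactivityMonitor:
    """Hypothetical inactivity monitor that stays dormant until the first event."""

    def __init__(self, threshold_seconds: float) -> None:
        self.threshold_seconds = threshold_seconds
        self.last_activity: Optional[float] = None  # None means "not armed yet"
        self.alerted = False

    def on_flowfile(self, now: float) -> None:
        # The first call arms the monitor; later calls reset the inactivity clock.
        self.last_activity = now
        self.alerted = False

    def check(self, now: float) -> bool:
        """Return True once per inactive period, and never before arming."""
        if self.last_activity is None or self.alerted:
            return False
        if now - self.last_activity >= self.threshold_seconds:
            self.alerted = True
            return True
        return False


monitor = ArmedInactivityMonitor(threshold_seconds=120)
startup_check = monitor.check(now=1200.0)  # 20 minutes after restart, nothing seen yet
monitor.on_flowfile(now=1260.0)            # first flowfile arms the monitor
early_check = monitor.check(now=1300.0)    # 40 s of silence, under the threshold
late_check = monitor.check(now=1380.0)     # 120 s of silence, reports inactivity
repeat_check = monitor.check(now=1400.0)   # already reported for this period
```

The point of the sketch is the `None` state: while no flowfile has arrived, the long upstream processing time after a restart cannot produce an inactivity trigger.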
Really simple use case, but an extremely frustrating issue.

Processors used:
- ListS3 on a prefix (directory path) with no filename included, so it matches multiple files. A few seconds later I see the files I want in the queue, correctly named with the full prefix.
- FetchS3Object. On the other side of this processor something is odd: this is the point where I can see that the filenames are wacky.

Sometimes the filenames come from other locations in S3; other times they are the name of a file that I haven't pulled in weeks. It seems like we're in a corrupt state, but I don't know how to get back to zero. The file contents are CORRECT; the issue is only with the filename property. Any thoughts?