Created 12-13-2018 11:14 AM
I have a ReplaceText processor that generates a Hive query. On the output side, I end up with several records carrying the same Hive query, which I feed into a PutHiveQL processor. As a result, PutHiveQL is executed several times, but I only need it to run once.
How could I do that? Is there a way to remove all ReplaceText output records but one?
Any help would be really appreciated.
Created 12-13-2018 01:33 PM
Use the DetectDuplicate processor: when several flowfiles carry the same Hive query, it detects the repeats and routes them to the duplicate relationship. You can also set the Age Off Duration property to expire cached entries.
Refer to this link for more details on DetectDuplicate processor usage.
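To make the idea concrete, here is a minimal Python sketch of the kind of check DetectDuplicate performs. It is not NiFi's implementation: the class name, the in-memory dictionary, and the SHA-256 key are assumptions made for illustration (in NiFi the cache lives in a distributed map cache service, and the key comes from the Cache Entry Identifier property).

```python
import hashlib
import time


class DuplicateDetector:
    """Toy in-memory stand-in for a cache-backed duplicate check (hypothetical)."""

    def __init__(self, age_off_seconds, clock=time.monotonic):
        self.age_off_seconds = age_off_seconds
        self.clock = clock
        self._seen = {}  # cache key -> time the key was cached

    def route(self, query):
        """Return 'non-duplicate' the first time a query is seen, else 'duplicate'."""
        key = hashlib.sha256(query.encode("utf-8")).hexdigest()
        now = self.clock()
        cached_at = self._seen.get(key)
        if cached_at is not None and now - cached_at < self.age_off_seconds:
            return "duplicate"
        # Unknown key, or the cached entry has aged off: cache it again.
        self._seen[key] = now
        return "non-duplicate"
```

In a flow, only the non-duplicate relationship would be connected to PutHiveQL, so the query runs once per age-off window.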
-
Another option is the ControlRate processor, but note that it releases the flowfiles waiting in the queue once the Time Duration has elapsed.
Refer to this link for more details on the ControlRate processor.
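The caveat above can be illustrated with a small Python sketch. It is a simplified model of count-based throttling, not NiFi's scheduler; the function name and parameters are hypothetical. It shows that queued flowfiles are delayed rather than dropped.

```python
from collections import deque


def release_schedule(flowfiles, max_per_window, window_seconds):
    """Return (release_time, flowfile) pairs under a count-based rate limit."""
    queue = deque(flowfiles)
    schedule = []
    window_start = 0
    while queue:
        # Release at most max_per_window flowfiles in the current window.
        for _ in range(min(max_per_window, len(queue))):
            schedule.append((window_start, queue.popleft()))
        # Whatever is left waits for the next window.
        window_start += window_seconds
    return schedule
```

With three identical queries, one flowfile per window, and a 60-second window, the second and third copies still come out at 60 and 120 seconds, so each of them would still reach PutHiveQL eventually.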
Created 12-13-2018 04:00 PM
Age Off Duration seems to be a straightforward solution.
I had thought about using DetectDuplicate, but here we would need an external DB to achieve that.
Thank you very much for the help, Shu.