I have a NiFi flow that ingests XML data and pushes it to an HBase table. We noticed that the XML data ingested every 5 minutes has the same content. Is there a way to add a processor that checks the data or pubDate to see whether it has changed compared with the data previously pushed to the HBase table?
You might look into using the "HashContent" and "DetectDuplicate" processors. You can create a hash of the content of each of your FlowFiles and use DetectDuplicate to see whether a FlowFile's hash matches a hash that was already processed. If so, the duplicate is routed out of your regular dataflow path so you don't send it to HBase.
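To illustrate the idea outside of NiFi, here is a minimal Python sketch of the hash-then-check pattern. It is not how the processors are implemented: the `DuplicateDetector` class, the in-memory set (standing in for the cache that DetectDuplicate relies on), the choice of SHA-256, and the sample feeds are all assumptions made up for this example.

```python
import hashlib


def content_hash(content: bytes) -> str:
    """Return a hex digest of the raw content (the role HashContent plays)."""
    return hashlib.sha256(content).hexdigest()


class DuplicateDetector:
    """Hypothetical stand-in for DetectDuplicate, backed by an in-memory set."""

    def __init__(self) -> None:
        self._seen = set()

    def is_duplicate(self, content: bytes) -> bool:
        digest = content_hash(content)
        if digest in self._seen:
            return True  # seen before: route away from the HBase path
        self._seen.add(digest)  # first time: remember it and let it through
        return False


# Made-up sample feeds; only the pubDate differs between them.
feed_v1 = b"<rss><channel><pubDate>Mon, 01 Jan 2018 10:00:00 GMT</pubDate></channel></rss>"
feed_v2 = b"<rss><channel><pubDate>Mon, 01 Jan 2018 10:05:00 GMT</pubDate></channel></rss>"

detector = DuplicateDetector()
results = [detector.is_duplicate(feed) for feed in (feed_v1, feed_v1, feed_v2)]
print(results)  # [False, True, False]
```

Note that a content hash changes if any byte changes, so if your XML carries a field that differs on every fetch, every FlowFile will look unique. In that case you could hash just the pubDate value instead of the whole content.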
Hope this helps,
@MattWho, thanks for the response. I've looked at that. I've been trying to get it to work both with the Hash and DetectDuplicate processors together and with DetectDuplicate alone (without the Hash processor). I must be doing something wrong. Is there an example or something I could look at?
Is there a way to upload my current NiFi flow?