Rssfeed (xml) check pubDate before ingesting data

New Contributor

I have a NiFi flow that ingests XML data and pushes it to an HBase table. We noticed that the XML data ingested every 5 minutes has the same content. Is there a way to add a processor that checks the data, or its pubDate, to see whether it has changed since the previous data pushed to the HBase table?

Re: Rssfeed (xml) check pubDate before ingesting data

Master Guru

@melvint

You might look into using the "HashContent" and "DetectDuplicate" processors. You can create a hash of the content of each of your FlowFiles and use DetectDuplicate to see whether a FlowFile's hash matches a hash that was already processed. If so, the duplicate is routed out of your regular dataflow path so you don't send it to HBase.
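
To make the idea concrete, here is a minimal sketch of the hash-then-check logic in plain Python. It is not NiFi code and does not use any NiFi API: the `DuplicateDetector` class, the `route` method, the in-memory set, the choice of SHA-256, and the sample feed are all assumptions made for illustration. In a real flow the seen hashes would live in whatever cache DetectDuplicate is configured to use, not in a Python set.

```python
import hashlib


def content_hash(content: bytes) -> str:
    """Return a hex digest of the content (SHA-256 is an arbitrary choice here)."""
    return hashlib.sha256(content).hexdigest()


class DuplicateDetector:
    """Hypothetical stand-in for the hash + duplicate check described above."""

    def __init__(self) -> None:
        # Hashes of content that has already been seen (in memory only).
        self._seen = set()

    def route(self, content: bytes) -> str:
        """Return 'duplicate' if this exact content was seen before, else 'non-duplicate'."""
        digest = content_hash(content)
        if digest in self._seen:
            return "duplicate"
        self._seen.add(digest)
        return "non-duplicate"


detector = DuplicateDetector()

# Made-up sample feed, fetched twice with identical content.
feed = b"<rss><channel><pubDate>Mon, 01 Jan 2024 00:00:00 GMT</pubDate></channel></rss>"
first = detector.route(feed)   # first time seen: continue on to HBase
second = detector.route(feed)  # same bytes again: route away from HBase

# A feed with a different pubDate hashes differently and is treated as new.
updated = feed.replace(b"01 Jan", b"02 Jan")
third = detector.route(updated)
```

Because the whole content is hashed, any byte that differs between fetches makes the content count as new. If only pubDate should decide, the same check could be applied to that value instead of the full content.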

 

Hope this helps,

Matt

Re: Rssfeed (xml) check pubDate before ingesting data

New Contributor

@MattWho Thanks for the response. I've looked at that. I've tried to get it working both with the HashContent and DetectDuplicate processors together and with DetectDuplicate alone, without adding the hash processor. I must be doing something wrong. Is there an example I could look at?

Is there a way to upload my current NiFi flow?