Recommended approach for listening to an RSS feed in Apache NiFi

Contributor

I want to set up a NiFi flow that gets data from a public RSS feed and loads it into a data lake. The feed updates irregularly, and each update overwrites the previous content.

Which processor(s) should I use to get data from the RSS feed as soon as possible after it has updated? Is it as simple as calling InvokeHTTP repeatedly, checking for a change in the output, then loading into the data lake if the content differs from the previous invocation? Is there another way if I don't want to make the HTTP request so frequently?

1 ACCEPTED SOLUTION

Master Guru

You can't listen for RSS updates; you have to request the feed yourself, since it's served over regular HTTP.

https://github.com/tspannhw/FLaNK-TravelAdvisory

 

If the feed gives you no way to tell when it has changed, then you can't know without fetching it.

You can read it once an hour, keep the entire result in a cache (like HBase), and throw the new copy away if it hasn't changed.
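
The poll-and-compare idea above can be sketched outside NiFi in a few lines of Python. This is only an illustration under some assumptions, not a NiFi flow: the names fetch_feed and poll_once are made up for this example, a plain dict stands in for the external cache (HBase or similar), and it stores a SHA-256 digest of the last response rather than the entire result, which is enough to detect a change.

```python
import hashlib
import urllib.request
from typing import Callable, MutableMapping, Optional


def fetch_feed(url: str, timeout: float = 30.0) -> bytes:
    """Download the raw feed body with an ordinary HTTP GET."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def poll_once(
    url: str,
    cache: MutableMapping[str, str],
    fetch: Callable[[str], bytes] = fetch_feed,
) -> Optional[bytes]:
    """Fetch the feed; return its body only if it differs from the last poll.

    `cache` maps feed URL -> SHA-256 hex digest of the last body seen.
    Assumption: a dict is used here in place of a real external cache.
    """
    body = fetch(url)
    digest = hashlib.sha256(body).hexdigest()
    if cache.get(url) == digest:
        return None  # unchanged since the last poll: throw this copy away
    cache[url] = digest  # remember the new version
    return body  # first poll or changed content: pass on to the data lake load
```

Calling poll_once on a schedule (once an hour, say) and loading the result only when it is not None gives the behaviour described above. Comparing digests is byte-exact, so a feed that embeds a changing timestamp in every response would register as changed on each poll.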
