Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant

Can we ingest RSS feeds into Hive using Apache NiFi?

Expert Contributor

Any thoughts on which processors to use in NiFi for ingesting RSS feeds into Hive tables? For example:

http://feeds.bbci.co.uk/news/world/rss.xml

Thanks.

1 ACCEPTED SOLUTION

Master Mentor
@surender nath reddy kudumula

Use the GetHTTP processor. This tutorial walks through it:

http://www.nifi.rocks/getting-started-with-apache-nifi/

The URL we are going to grab a file from is XKCD’s RSS feed, rss.xml. The GetHTTP processor is simple to configure and just needs the URL property to be set to http://xkcd.com/rss.xml. Click on the value across from the URL property and enter http://xkcd.com/rss.xml.

Now drag down another processor, EvaluateXPath. Under the properties for this processor, set the following property-value pairs:

  • Destination - flowfile-attribute
  • Return Type - auto-detect
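
To show what the EvaluateXPath step is for: it evaluates XPath expressions against the fetched XML so that individual fields can be pulled out of the feed. The Python sketch below is not NiFi code; it only mimics that idea with the standard library. The inline sample feed, the function name extract_items, and the choice of the title and link fields are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Small inline stand-in for the feed body that GetHTTP would fetch
# (hypothetical content, not a real feed).
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item>
      <title>First post</title>
      <link>http://example.com/1</link>
    </item>
    <item>
      <title>Second post</title>
      <link>http://example.com/2</link>
    </item>
  </channel>
</rss>"""


def extract_items(rss_text):
    """Return one dict per <item>, holding its title and link."""
    root = ET.fromstring(rss_text)  # root element is <rss>
    items = []
    # Path expression selecting every <item> under <channel>
    for item in root.findall("./channel/item"):
        items.append({
            "title": item.findtext("title"),
            "link": item.findtext("link"),
        })
    return items


items = extract_items(SAMPLE_RSS)
for entry in items:
    print(entry["title"], entry["link"])
```

In the NiFi flow itself, the equivalent expressions would be added as properties on EvaluateXPath, and with Destination set to flowfile-attribute the results end up as attributes on the flow file.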


7 REPLIES

Expert Contributor

Thank you @Neeraj Sabharwal. I'll give this a try. Why do we need EvaluateXPath here? Also, I believe the last step should be PutHDFS to route the data into HDFS. As far as I can tell, NiFi currently doesn't have a processor to ingest directly into a Hive table, although I can see it has PutSQL. So I believe the best approach is to use PutHDFS. Please let me know. Thanks.

Master Mentor

@surender nath reddy kudumula I would put the data in HDFS and create an external Hive table on top of it.
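
A rough sketch of that idea, assuming each feed item is flattened to one tab-delimited line before it lands in HDFS. The helper to_tsv_line, the table name rss_items, its columns, and the path /data/rss/items are all hypothetical; adjust them to your own layout. The Python part only shows the row format, and the HiveQL is kept as a string you could run from the Hive CLI or Beeline.

```python
def to_tsv_line(title, link, pub_date):
    """Join one feed item's fields into a single tab-delimited row."""
    def clean(value):
        # Collapse tabs and newlines inside a field so the row stays intact
        return " ".join((value or "").split())
    return "\t".join(clean(v) for v in (title, link, pub_date))


# Hypothetical external table over the directory the flow writes to.
HIVE_DDL = """
CREATE EXTERNAL TABLE IF NOT EXISTS rss_items (
  title    STRING,
  link     STRING,
  pub_date STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\\t'
STORED AS TEXTFILE
LOCATION '/data/rss/items'
"""

row = to_tsv_line("World\tnews", "http://example.com/1", "Mon, 01 Jan 2016")
print(row)
print(HIVE_DDL)
```

Because the table is external, Hive only reads the files already under that location, so new files written by the flow show up in queries without a separate load step.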

Expert Contributor

Thank you @Neeraj Sabharwal.

Master Mentor

@surender nath reddy kudumula As a best practice, please accept the best answer to close the thread.

Master Collaborator

@surender nath reddy kudumula - There is a JIRA actively being worked on to add Hive JDBC support to NiFi: https://issues.apache.org/jira/browse/NIFI-981

New Member

Neeraj, thanks for your advice.