Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant

Does anyone know how we can stream Twitter data from specific Twitter accounts using Apache Flume?

1 ACCEPTED SOLUTION

@azza messaoudi, check the following Twitter doc: https://dev.twitter.com/streaming/reference/post/statuses/filter

And here is a custom Flume source implementation that supports all of the Twitter streaming parameters: http://www.dataprocessingtips.com/2016/04/24/custom-twitter-source-for-apache-flume/

(including the "follow" parameter, which is the one you're actually interested in)
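
To make the "follow" parameter a bit more concrete, here is a minimal sketch of how the request body for statuses/filter could be put together. It assumes, as the Twitter doc linked above describes, that "follow" is a comma-separated list of numeric user IDs (not @screen names) and that "track" is a comma-separated list of keywords. The helper name `build_filter_body` and the account IDs are made up for illustration; this is not code from the linked Flume source, and it only builds the form-encoded body without opening a connection.

```python
from urllib.parse import urlencode


def build_filter_body(user_ids, track_terms=()):
    """Return a form-encoded POST body for a statuses/filter request.

    Hypothetical helper for illustration: `follow` carries numeric user IDs,
    `track` carries keywords, both as comma-separated lists.
    """
    if not user_ids and not track_terms:
        raise ValueError("at least one of user_ids or track_terms is required")

    params = {}
    if user_ids:
        # int() rejects anything that is not a numeric ID, e.g. "@someuser".
        params["follow"] = ",".join(str(int(uid)) for uid in user_ids)
    if track_terms:
        params["track"] = ",".join(track_terms)

    # urlencode percent-encodes the commas (%2C), as form encoding requires.
    return urlencode(params)


if __name__ == "__main__":
    # Made-up account IDs, for illustration only.
    print(build_filter_body([111111, 222222]))
    # prints: follow=111111%2C222222
```

Whatever source you end up using in Flume, the list of account IDs you configure would need to end up in a "follow" value of this shape.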

2 REPLIES

Hi @azza messaoudi, sorry, this isn't a direct answer to your question, but unless you REALLY want to use Flume for this, have you looked at NiFi?

There is already a GetTwitter processor that you can configure for search terms etc., and there are a number of really good demos that show how you can put all this together.

There is a 3-part demo series here:

https://www.linkedin.com/pulse/apache-nifi-part-1-introduction-neeraj-sabharwal

... and a full tutorial, including indexing in Solr, here: http://hortonworks.com/hadoop-tutorial/how-to-refine-and-visualize-sentiment-data/

I know that's not exactly what you asked for, but I promise that it's the easiest way I've ever used to get Twitter data into Hadoop.

Hope that helps.
