Does anyone know how to stream Twitter data from specific Twitter accounts using Apache Flume?

1 ACCEPTED SOLUTION

@azza messaoudi, check the Twitter documentation for the statuses/filter streaming endpoint: https://dev.twitter.com/streaming/reference/post/statuses/filter

Here is a custom Flume source implementation that supports all of the Twitter streaming parameters, including the "follow" parameter, which is the one you are interested in: http://www.dataprocessingtips.com/2016/04/24/custom-twitter-source-for-apache-flume/
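To make the "follow" parameter a bit more concrete, here is a minimal sketch in Python of the form-encoded request body that statuses/filter expects. It is not taken from the custom Flume source linked above; the helper name `build_filter_body` and the user IDs are made up for illustration, and the endpoint URL is the v1.1 one described in the Twitter doc. Note that "follow" takes numeric user IDs separated by commas, not screen names.

```python
from urllib.parse import urlencode

# Endpoint from the Twitter statuses/filter doc (v1.1 streaming API).
FILTER_URL = "https://stream.twitter.com/1.1/statuses/filter.json"


def build_filter_body(user_ids, track_terms=()):
    """Build the form-encoded POST body for statuses/filter.

    Hypothetical helper for illustration only.
    user_ids    -- numeric Twitter user IDs to follow
    track_terms -- optional keywords to track
    """
    if not user_ids and not track_terms:
        raise ValueError("statuses/filter needs at least one filter parameter")

    params = {}
    if user_ids:
        # "follow" is a comma-separated list of numeric user IDs.
        params["follow"] = ",".join(str(uid) for uid in user_ids)
    if track_terms:
        # "track" is a comma-separated list of keywords.
        params["track"] = ",".join(track_terms)
    return urlencode(params)


if __name__ == "__main__":
    # Placeholder IDs, not real accounts.
    print(FILTER_URL)
    print(build_filter_body([111111, 222222]))
```

The actual request also has to be signed with OAuth using your application's consumer key and access token, which this sketch leaves out. A Flume source that supports "follow" would pass the same comma-separated ID list through to the streaming API.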


2 REPLIES

Hi @azza messaoudi, sorry this isn't a direct answer to your question, but unless you really need to use Flume for this, have you looked at NiFi?

There is already a GetTwitter processor that you can configure with search terms and so on, and there are a number of really good demos that show how to put it all together.

A three-part demo series is here:

https://www.linkedin.com/pulse/apache-nifi-part-1-introduction-neeraj-sabharwal

... and a full tutorial, including indexing in Solr, is here: http://hortonworks.com/hadoop-tutorial/how-to-refine-and-visualize-sentiment-data/

I know that's not exactly what you asked for, but it's the easiest way to get Twitter data into Hadoop that I've ever used.

Hope that helps.
