Can you please tell me how to configure flume classpath and make the Twitter problem work?

Hi, I completed the basic Flume assignment, in which a text file I created is eventually copied to HDFS. Now that I am trying to work on the Twitter example, I am facing problems, and I think it's because of one of these points:

1. I have used the Cloudera-provided JAR file to copy Twitter data. This should not make a difference, should it?

2. I have placed the JAR file in the /usr/lib/flume/lib/ directory. Is this the correct place to put the JAR?

3. In FLUME_CLASSPATH, I have given the above library path.

4. I am starting the Flume agent using the following command:

$ bin/flume-ng agent -n $TwitterAgent -c conf -f conf/flume-conf

I have no clue where I am going wrong, or maybe I am missing the basics somewhere. It would be good if someone could provide some guidance. Thanks a lot.

9 REPLIES

Re: Can you please tell me how to configure flume classpath and make the Twitter problem work?

Mentor

If this is Cloudera-specific, we can't help much, as configurations differ between the two platforms. May I recommend looking at Apache NiFi instead? It could save you tons of grief: https://blogs.apache.org/nifi/entry/indexing_tweets_with_nifi_and

Re: Can you please tell me how to configure flume classpath and make the Twitter problem work?

Mentor

Here's a fancier version of that https://duckduckgo.com/?q=simple+nifi+banana+twitter

Re: Can you please tell me how to configure flume classpath and make the Twitter problem work?

Hi,

Thanks for your prompt response. Actually, I am using the Hortonworks Sandbox; I was just using the Cloudera-provided JAR, which acts as the source that copies the Twitter data to HDFS. If you can advise where I can get a JAR that is compatible with Hortonworks, or whether Hortonworks has any such JAR, that would help. I was aware of NiFi, but I want to do this assignment using Flume.

Thanks.

Re: Can you please tell me how to configure flume classpath and make the Twitter problem work?

Mentor

I'm not sure I understand. The Twitter firehose source is available in Flume itself. The only thing I'm not sure about is whether it is available in our version: it was released as part of Flume 1.6, and we are on 1.5.2. A lot of features were backported to our version, but I can't say for sure about this source. There is no separate JAR; you just plug in your Twitter application credentials and you are good to go.
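
For reference, here is a rough sketch of what plugging in the credentials might look like, assuming your Flume build actually ships the bundled Twitter source (class org.apache.flume.source.twitter.TwitterSource). The file name, the component names (Twitter, MemChannel, Logger) and the credential placeholders are illustrative assumptions, and the logger sink is there only to keep the example small; in practice you would point the channel at an HDFS sink.

```shell
# Sketch: write a minimal agent config that uses Flume's bundled Twitter source.
# CONF_FILE is a hypothetical file name; the YOUR_* values are placeholders
# for the keys of your own Twitter application.
CONF_FILE="${CONF_FILE:-./twitter-agent.conf}"

cat > "$CONF_FILE" <<'EOF'
TwitterAgent.sources = Twitter
TwitterAgent.channels = MemChannel
TwitterAgent.sinks = Logger

TwitterAgent.sources.Twitter.type = org.apache.flume.source.twitter.TwitterSource
TwitterAgent.sources.Twitter.consumerKey = YOUR_CONSUMER_KEY
TwitterAgent.sources.Twitter.consumerSecret = YOUR_CONSUMER_SECRET
TwitterAgent.sources.Twitter.accessToken = YOUR_ACCESS_TOKEN
TwitterAgent.sources.Twitter.accessTokenSecret = YOUR_ACCESS_TOKEN_SECRET
TwitterAgent.sources.Twitter.channels = MemChannel

TwitterAgent.channels.MemChannel.type = memory

TwitterAgent.sinks.Logger.type = logger
TwitterAgent.sinks.Logger.channel = MemChannel
EOF

echo "Wrote $CONF_FILE"
```

The agent name used when starting Flume has to match the TwitterAgent prefix in this file.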

Re: Can you please tell me how to configure flume classpath and make the Twitter problem work?

Mentor

I think I understand now. Here's an example: http://www.thecloudavenue.com/2013/03/analyse-tweets-using-flume-hadoop-and.html

It looks like you need to use their version of Flume and not rely on the Ambari-managed Flume. You can use their JAR and start the agent from the location you put it in. Do not confuse their JAR with our Flume version. I would put their JAR in your home directory or /usr/local, then start the agent pointing to the custom location you chose.
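
As a rough sketch of that setup, assuming the JAR is kept in a private directory and exposed through FLUME_CLASSPATH in a flume-env.sh file (flume-ng reads that file from the directory passed with -c): the directory layout and the JAR file name below are placeholders, not the real names of the Cloudera artifact.

```shell
# Sketch: keep the third-party JAR outside Flume's own lib directory and
# expose it through FLUME_CLASSPATH in a private conf directory.
# BASE_DIR and the JAR file name are assumptions; adjust them to your setup.
BASE_DIR="${BASE_DIR:-$PWD/twitter-flume}"
CONF_DIR="${CONF_DIR:-$BASE_DIR/conf}"
JAR_PATH="${JAR_PATH:-$BASE_DIR/twitter-source.jar}"

mkdir -p "$CONF_DIR"

# flume-ng sources flume-env.sh from the directory given with -c / --conf,
# so the variable set here ends up on the agent's classpath.
cat > "$CONF_DIR/flume-env.sh" <<EOF
FLUME_CLASSPATH="$JAR_PATH"
EOF

echo "Start the agent with: flume-ng agent -n TwitterAgent -c $CONF_DIR -f $CONF_DIR/flume-conf"
```

The agent configuration file (flume-conf here, as in the original command) would then live in the same conf directory.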

Re: Can you please tell me how to configure flume classpath and make the Twitter problem work?

I will try this way and let you know if it worked. Thanks again for your help.

Re: Can you please tell me how to configure flume classpath and make the Twitter problem work?

Expert Contributor

Hi,

What's the error you got?

I doubt it is something Cloudera-specific. Flume is a very simple tool in terms of installation and configuration.

To adjust the classpath per agent, you can add the "--classpath" argument to the command.
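
A minimal sketch of such a command, assuming the extra JAR lives at a path of your choosing (the path below is a placeholder). Note that the agent name is passed literally: in the command quoted in the question, $TwitterAgent would be expanded by the shell, and to an empty string if that variable is not set. The script only prints the command so you can inspect it before running it.

```shell
# Sketch: build the flume-ng command with an explicit per-agent classpath.
AGENT_NAME="TwitterAgent"                              # must match the prefix used in the config file
EXTRA_JAR="${EXTRA_JAR:-/path/to/twitter-source.jar}"  # placeholder path, not a real location

# -n agent name, -c conf directory, -f config file, --classpath extra entries
CMD="flume-ng agent -n $AGENT_NAME -c conf -f conf/flume-conf --classpath $EXTRA_JAR"

echo "$CMD"
```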

From my experience, if you need to use Flume with Twitter, it is better to remove everything related to the Twitter libs from the default Flume classpath, just to avoid issues with twitter4j dependencies.

I have no idea why they are included in the installation by default.

Re: Can you please tell me how to configure flume classpath and make the Twitter problem work?

Contributor

I have a 4-node cluster with HDP 2.3.4.0 and a compatible Ambari agent. Flume is running on the 4th node and the NameNode on the 1st node. Where should I run the Flume agent, and should I configure flume.conf on the machine itself or in the Ambari UI? I'm totally confused by this multi-node cluster and Ambari. Can you please guide me on how to successfully ingest the Twitter data with Flume? Any useful link may help.

Thanks in advance.

Re: Can you please tell me how to configure flume classpath and make the Twitter problem work?

Explorer

Hello,

The Flume agent settings must be configured in the Ambari UI. I captured tweets with Flume by following this link, and it worked for me: http://blog.hubacek.uk/streaming-tweets-into-hadoop-part-ii/

Do not forget that the JAR downloaded from Cloudera must be in your Flume lib path, for example on CentOS (usr/HDP/current/flume/lib).

Next, set up the agents in the Ambari UI and restart Flume from Ambari; it should then begin capturing your tweets.

If you get any errors, you can check the Flume log, for example on CentOS (var/log/flume).
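
A small sketch of that log check, assuming the logs are files ending in .log under the directory mentioned above (written here as an absolute path); the file naming and the patterns searched for are assumptions, so adjust them to what your installation actually writes.

```shell
# Sketch: show recent error lines from the newest Flume log file.
LOG_DIR="${LOG_DIR:-/var/log/flume}"   # directory from the post; change if yours differs

recent_errors() {
    # $1 = log directory; prints up to the last 20 lines mentioning ERROR or Exception
    newest=$(ls -t "$1"/*.log 2>/dev/null | head -n 1)
    if [ -z "$newest" ]; then
        echo "No .log files found in $1" >&2
        return 1
    fi
    grep -E 'ERROR|Exception' "$newest" | tail -n 20
}

recent_errors "$LOG_DIR" || true
```

A ClassNotFoundException in this output would usually point back to the classpath or JAR location.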

Greetings and good luck
