
Unable to download live tweets into hdfs


Explorer

Hi All,

I want to download live tweets from twitter.com into HDFS.

I have followed all the steps mentioned in this link:

https://www.eduonix.com/blog/bigdata-and-hadoop/flume-installation-and-streaming-twitter-data-using-...

I need a few clarifications about flume.conf, which contains the following lines:

TwitterAgent.sources.Twitter.type = com.cloudera.flume.source.TwitterSource

TwitterAgent.sinks.HDFS.hdfs.path=hdfs://localhost:9000/user/flume/tweets

I am using the HDP 2.4 sandbox, so what should the values be for:

TwitterAgent.sources.Twitter.type =

TwitterAgent.sinks.HDFS.hdfs.path=

Please help me on this.

Regards

Abhishek

3 REPLIES

Re: Unable to download live tweets into hdfs

Mentor

@Abhishek Srivastava

The HDFS sink URL should be the fs.defaultFS URL, which uses port 8020, not port 9000. Can you explain where you got that port?

TwitterAgent.sources.Twitter.type = org.apache.flume.source.twitter.TwitterSource
TwitterAgent.sinks.HDFS.hdfs.path = hdfs://FQDN:8020/user/flume/tweets
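For context, a minimal flume.conf for this setup might look like the sketch below. The agent, source, channel, and sink names follow your snippet; the four OAuth values are placeholders you must replace with your own Twitter app credentials, and sandbox.hortonworks.com is the usual sandbox hostname (verify yours):

```properties
# Hypothetical minimal flume.conf for streaming tweets into HDFS on the sandbox.
TwitterAgent.sources = Twitter
TwitterAgent.channels = MemChannel
TwitterAgent.sinks = HDFS

TwitterAgent.sources.Twitter.type = org.apache.flume.source.twitter.TwitterSource
TwitterAgent.sources.Twitter.channels = MemChannel
TwitterAgent.sources.Twitter.consumerKey = YOUR_CONSUMER_KEY
TwitterAgent.sources.Twitter.consumerSecret = YOUR_CONSUMER_SECRET
TwitterAgent.sources.Twitter.accessToken = YOUR_ACCESS_TOKEN
TwitterAgent.sources.Twitter.accessTokenSecret = YOUR_ACCESS_TOKEN_SECRET

TwitterAgent.channels.MemChannel.type = memory
TwitterAgent.channels.MemChannel.capacity = 10000
TwitterAgent.channels.MemChannel.transactionCapacity = 100

TwitterAgent.sinks.HDFS.channel = MemChannel
TwitterAgent.sinks.HDFS.type = hdfs
# Use the fs.defaultFS host and port (8020 on the sandbox), not 9000.
TwitterAgent.sinks.HDFS.hdfs.path = hdfs://sandbox.hortonworks.com:8020/user/flume/tweets
TwitterAgent.sinks.HDFS.hdfs.fileType = DataStream
TwitterAgent.sinks.HDFS.hdfs.writeFormat = Text
TwitterAgent.sinks.HDFS.hdfs.batchSize = 100
TwitterAgent.sinks.HDFS.hdfs.rollSize = 0
TwitterAgent.sinks.HDFS.hdfs.rollCount = 1000
```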


Also make sure that /user/flume/tweets is writable by the user running the agent!

Hope that helps. It would have helped even more if you had copied your entire flume.conf, of course without your consumerKey and consumerSecret.
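To check both points above on the sandbox, something like the following should work. These commands are a sketch: the core-site.xml path assumes a standard HDP install, and the flume user/group is an assumption — chown the directory to whichever user actually runs the agent:

```shell
# Confirm the fs.defaultFS host and port that the sink URL must match
grep -A1 'fs.defaultFS' /etc/hadoop/conf/core-site.xml

# Create the target directory in HDFS and hand it to the user running the agent
sudo -u hdfs hdfs dfs -mkdir -p /user/flume/tweets
sudo -u hdfs hdfs dfs -chown -R flume:hdfs /user/flume

# Verify ownership and permissions
hdfs dfs -ls /user/flume
```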


Re: Unable to download live tweets into hdfs

Explorer

@Geoffrey Shelton Okot

Hi ,

You mentioned setting the source type to org.apache.flume.source.twitter.TwitterSource.

But as I told you, I am using the Hortonworks HDP 2.4 sandbox. Will it work there as well?

Also, what does FQDN stand for?

Please explain.

Regards

Abhishek


Re: Unable to download live tweets into hdfs

Explorer

@Geoffrey Shelton Okot

Hi ,

I am unable to download the live tweets.

My process stops at the following step.

Please find the logs below:

[root@sandbox apache-flume-1.4.0-bin]# ./bin/flume-ng agent -n TwitterAgent -c conf -f /usr/lib/apache-flume-1.4.0-bin/conf/flume.conf

Info: Including Hadoop libraries found via (/usr/bin/hadoop) for HDFS access

Info: Excluding /usr/hdp/2.4.0.0-169/hadoop/lib/slf4j-api-1.7.10.jar from classpath

Info: Excluding /usr/hdp/2.4.0.0-169/hadoop/lib/slf4j-log4j12-1.7.10.jar from classpath

Info: Excluding /usr/hdp/2.4.0.0-169/tez/lib/slf4j-api-1.7.5.jar from classpath

Info: Including HBASE libraries found via (/usr/bin/hbase) for HBASE access

Info: Excluding /usr/hdp/2.4.0.0-169/hbase/lib/slf4j-api-1.7.7.jar from classpath

Info: Excluding /usr/hdp/2.4.0.0-169/hadoop/lib/slf4j-api-1.7.10.jar from classpath

Info: Excluding /usr/hdp/2.4.0.0-169/hadoop/lib/slf4j-log4j12-1.7.10.jar from classpath

Info: Excluding /usr/hdp/2.4.0.0-169/tez/lib/slf4j-api-1.7.5.jar from classpath

Info: Excluding /usr/hdp/2.4.0.0-169/hadoop/lib/slf4j-api-1.7.10.jar from classpath

Info: Excluding /usr/hdp/2.4.0.0-169/hadoop/lib/slf4j-log4j12-1.7.10.jar from classpath

Info: Excluding /usr/hdp/2.4.0.0-169/zookeeper/lib/slf4j-api-1.6.1.jar from classpath

Info: Excluding /usr/hdp/2.4.0.0-169/zookeeper/lib/slf4j-log4j12-1.6.1.jar from classpath

+ exec /usr/lib/jvm/java/bin/java -Xmx20m -cp '/usr/lib/apache-flume-1.4.0-bin/conf:/usr/lib/apache-flume-1.4.0-bin/lib/*:/usr/hdp/2.4.0.0-169/hadoop/conf:/usr/hdp/2.4.0.0-169/hadoop/lib/activation-1.1.jar:/usr/hdp/2.4.0.0-169/hadoop/lib/apacheds-i18n-2.0.0-M15.jar:/usr/hdp/2.4.0.0-169/hadoop/lib/apacheds-kerberos-codec-2.0.0-M15.jar:/usr/hdp/2.4.0.0-169/hadoop/lib/api-asn1-api-1.0.0-M20.jar:/usr/hdp/2.4.0.0-169/hadoop/lib/api-util-1.0.0-M20.jar:/usr/hdp/2.4.0.0-169/hadoop/lib/asm-3.2.jar:/usr/hdp/2.4.0.0-169/hadoop/lib/avro-1.7.4.jar:/usr/hdp/2.4.0.0-169/hadoop/lib/aws-java-sdk-1.7.4.jar:/usr/hdp/2.4.0.0-169/hadoop/lib/azure-storage-2.2.0.jar:/usr/hdp/2.4.0.0-

...

Regards

Abhishek