
Flume not able to get the data from twitter

Rising Star

I have set all the required properties and the registered entries in the configuration. However, when I run the Flume agent I get the logs below, and I cannot see any data in HDFS.

Attached is my configuration file: flumeconf.png

16/12/24 11:48:34 INFO conf.FlumeConfiguration: Processing:HDFS
16/12/24 11:48:34 INFO conf.FlumeConfiguration: Processing:HDFS
16/12/24 11:48:35 WARN conf.FlumeConfiguration: Could not configure source Twitter due to: No Channels configured for Twitter
org.apache.flume.conf.ConfigurationException: No Channels configured for Twitter
    at org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.validateSources(FlumeConfiguration.java:574)
    at org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.isValid(FlumeConfiguration.java:346)
    at org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.access$000(FlumeConfiguration.java:213)
    at org.apache.flume.conf.FlumeConfiguration.validateConfiguration(FlumeConfiguration.java:127)
    at org.apache.flume.conf.FlumeConfiguration.<init>(FlumeConfiguration.java:109)
    at org.apache.flume.node.PropertiesFileConfigurationProvider.getFlumeConfiguration(PropertiesFileConfigurationProvider.java:189)
    at org.apache.flume.node.AbstractConfigurationProvider.getConfiguration(AbstractConfigurationProvider.java:89)
    at org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:140)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
16/12/24 11:48:35 INFO conf.FlumeConfiguration: Post-validation flume configuration contains configuration for agents: [TwitterAgent]
16/12/24 11:48:35 INFO node.AbstractConfigurationProvider: Creating channels
16/12/24 11:48:35 INFO channel.DefaultChannelFactory: Creating instance of channel Memchannel type memory
16/12/24 11:48:35 INFO node.AbstractConfigurationProvider: Created channel Memchannel
16/12/24 11:48:35 INFO sink.DefaultSinkFactory: Creating instance of sink: HDFS, type: hdfs
16/12/24 11:48:35 INFO node.AbstractConfigurationProvider: Channel Memchannel connected to [HDFS]
16/12/24 11:48:35 INFO node.Application: Starting new configuration:{ sourceRunners:{} sinkRunners:{HDFS=SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor@1f8dbaa6 counterGroup:{ name:null counters:{} } }} channels:{Memchannel=org.apache.flume.channel.MemoryChannel{name: Memchannel}} }
16/12/24 11:48:35 INFO node.Application: Starting Channel Memchannel
16/12/24 11:48:36 INFO instrumentation.MonitoredCounterGroup: Monitored counter group for type: CHANNEL, name: Memchannel: Successfully registered new MBean.
16/12/24 11:48:36 INFO instrumentation.MonitoredCounterGroup: Component type: CHANNEL, name: Memchannel started
16/12/24 11:48:36 INFO node.Application: Starting Sink HDFS
16/12/24 11:48:36 INFO instrumentation.MonitoredCounterGroup: Monitored counter group for type: SINK, name: HDFS: Successfully registered new MBean.
16/12/24 11:48:36 INFO instrumentation.MonitoredCounterGroup: Component type: SINK, name: HDFS started


@Praveen PentaReddy

From the log, the problem appears to be with the channels: "Could not configure source Twitter due to: No Channels configured for Twitter"

I suggest revisiting your agent configuration file. Please try the links below:

http://vijayanyayapathi.com/apache-flume-load-data-from-twitter/

https://www.tutorialspoint.com/apache_flume/fetching_twitter_data.htm

Rising Star

I have configured the Flume conf file exactly as described in the document, and when I run the Flume agent I get the log below.

Any recommendations?

:03:10 INFO node.PollingPropertiesFileConfigurationProvider: Configuration provider starting
16/12/25 12:03:10 INFO node.PollingPropertiesFileConfigurationProvider: Reloading configuration file:/etc/flume-ng/conf/flume.conf
16/12/25 12:03:10 INFO conf.FlumeConfiguration: Processing:HDFS
16/12/25 12:03:10 INFO conf.FlumeConfiguration: Processing:HDFS
16/12/25 12:03:10 INFO conf.FlumeConfiguration: Added sinks: HDFS Agent: TwitterAgent
16/12/25 12:03:10 INFO conf.FlumeConfiguration: Processing:HDFS
16/12/25 12:03:10 INFO conf.FlumeConfiguration: Processing:HDFS
16/12/25 12:03:10 INFO conf.FlumeConfiguration: Processing:HDFS
16/12/25 12:03:10 INFO conf.FlumeConfiguration: Processing:HDFS
16/12/25 12:03:10 INFO conf.FlumeConfiguration: Processing:HDFS
16/12/25 12:03:10 INFO conf.FlumeConfiguration: Processing:HDFS
16/12/25 12:03:10 WARN conf.FlumeConfiguration: Agent configuration for 'TwitterAgent' does not contain any valid channels. Marking it as invalid.
16/12/25 12:03:10 WARN conf.FlumeConfiguration: Agent configuration invalid for agent 'TwitterAgent'. It will be removed.
16/12/25 12:03:10 INFO conf.FlumeConfiguration: Post-validation flume configuration contains configuration for agents: []
16/12/25 12:03:10 WARN node.AbstractConfigurationProvider: No configuration found for this host:TwitterAgent
16/12/25 12:03:10 INFO node.Application: Starting new configuration:{ sourceRunners:{} sinkRunners:{} channels:{} }

Rising Star

Any recommendations? Attached is my conf file: flumeconf.png

Contributor

@Praveen PentaReddy

In your config, please change the property TwitterAgent.sinks.HDFS.channels to TwitterAgent.sinks.HDFS.channel.

Note that you have the plural "channels" where it should be the singular "channel".
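
For reference, here is a minimal sketch of how the channel wiring usually looks. I cannot see the attached image here, so this is not your file; the component names Twitter, Memchannel and HDFS are taken from your log, and everything else about your setup is assumed. The point it illustrates is that a source uses the plural key "channels" while a sink uses the singular key "channel", and that the channel itself must be declared on the agent.

```properties
# Declare the agent's components (names taken from the log above)
TwitterAgent.sources = Twitter
TwitterAgent.channels = Memchannel
TwitterAgent.sinks = HDFS

# A source can write to several channels, so its key is plural
TwitterAgent.sources.Twitter.channels = Memchannel

# A sink reads from exactly one channel, so its key is singular
TwitterAgent.sinks.HDFS.channel = Memchannel

# The channel needs at least a type to be considered valid
TwitterAgent.channels.Memchannel.type = memory
```

If any of these names differ in spelling or case between lines, Flume treats them as different components, which can also produce the "No Channels configured" or "does not contain any valid channels" warnings.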

Rising Star

After making the appropriate changes, I am still getting the error below, even after downloading the required jar file flume-sources-1.0-SNAPSHOT.jar. I can see the TwitterSource class in this jar, so I am not sure why I am getting this error.

I have also attached the conf file: flumeconf.png

16/12/26 11:25:49 ERROR lifecycle.LifecycleSupervisor: Unable to start EventDrivenSourceRunner: { source:com.cloudera.flume.source.TwitterSource{name:Twitter,state:IDLE} } - Exception follows.
java.lang.IllegalStateException: Authentication credentials are missing. See http://twitter4j.org/en/configuration.html for details
    at twitter4j.TwitterBaseImpl.ensureAuthorizationEnabled(TwitterBaseImpl.java:215)
    at twitter4j.TwitterStreamImpl.filter(TwitterStreamImpl.java:317)
    at com.cloudera.flume.source.TwitterSource.start(TwitterSource.java:150)
    at org.apache.flume.source.EventDrivenSourceRunner.start(EventDrivenSourceRunner.java:44)
    at org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:251)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

@Praveen PentaReddy

How did you install the Twitter library?

I see the package "com.cloudera.flume.source" in the error you posted. Are you using CDH or HDP?

Read this to learn how this is to be installed on HDP:

https://community.hortonworks.com/content/supportkb/49166/conflicting-flume-documentation-regarding-...

Rising Star

I am using the Cloudera distribution. I just copied the jar file into place and added it to the classpath.

Contributor

@Praveen PentaReddy

Assuming you're running this with something like: flume-ng agent -n TwitterAgent -c conf -f flume.conf /usr/lib/flume-ng/conf/flume-conf.properties.

It appears your keys are incorrect. Double-check your tokens and retry; it looks like a typo or a similar problem with your Twitter credentials. I would also recommend using the Twitter source that ships with Flume itself, if you can:

TwitterAgent.sources.Twitter.type = org.apache.flume.source.twitter.TwitterSource
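
To make the credentials check concrete, here is a rough sketch of the source section as I would expect it to look. The four uppercase values are placeholders, not real keys, and the channel name Memchannel is assumed from your earlier log. The property names are case-sensitive, so a misspelled or missing name (for example "consumerkey" instead of "consumerKey") could plausibly leave the credential unset and lead to the "Authentication credentials are missing" error.

```properties
TwitterAgent.sources.Twitter.type = org.apache.flume.source.twitter.TwitterSource
TwitterAgent.sources.Twitter.channels = Memchannel

# Placeholders: replace with the values from your registered Twitter application
TwitterAgent.sources.Twitter.consumerKey = YOUR_CONSUMER_KEY
TwitterAgent.sources.Twitter.consumerSecret = YOUR_CONSUMER_SECRET
TwitterAgent.sources.Twitter.accessToken = YOUR_ACCESS_TOKEN
TwitterAgent.sources.Twitter.accessTokenSecret = YOUR_ACCESS_TOKEN_SECRET
```

It is worth comparing each of the four lines in your file against these names character by character before regenerating any tokens.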

Rising Star

If I specify the class below, do I need to download any jar files?

TwitterAgent.sources.Twitter.type = org.apache.flume.source.twitter.TwitterSource