Flume not able to get the data from twitter


Rising Star

I have set up all the required configuration, including the registered entry, but when I run the Flume agent I get the logs below and I cannot see any data in HDFS.

Attached is my configuration file: flumeconf.png

16/12/24 11:48:34 INFO conf.FlumeConfiguration: Processing:HDFS
16/12/24 11:48:34 INFO conf.FlumeConfiguration: Processing:HDFS
16/12/24 11:48:35 WARN conf.FlumeConfiguration: Could not configure source Twitter due to: No Channels configured for Twitter
org.apache.flume.conf.ConfigurationException: No Channels configured for Twitter
    at org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.validateSources(FlumeConfiguration.java:574)
    at org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.isValid(FlumeConfiguration.java:346)
    at org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.access$000(FlumeConfiguration.java:213)
    at org.apache.flume.conf.FlumeConfiguration.validateConfiguration(FlumeConfiguration.java:127)
    at org.apache.flume.conf.FlumeConfiguration.<init>(FlumeConfiguration.java:109)
    at org.apache.flume.node.PropertiesFileConfigurationProvider.getFlumeConfiguration(PropertiesFileConfigurationProvider.java:189)
    at org.apache.flume.node.AbstractConfigurationProvider.getConfiguration(AbstractConfigurationProvider.java:89)
    at org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:140)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
16/12/24 11:48:35 INFO conf.FlumeConfiguration: Post-validation flume configuration contains configuration for agents: [TwitterAgent]
16/12/24 11:48:35 INFO node.AbstractConfigurationProvider: Creating channels
16/12/24 11:48:35 INFO channel.DefaultChannelFactory: Creating instance of channel Memchannel type memory
16/12/24 11:48:35 INFO node.AbstractConfigurationProvider: Created channel Memchannel
16/12/24 11:48:35 INFO sink.DefaultSinkFactory: Creating instance of sink: HDFS, type: hdfs
16/12/24 11:48:35 INFO node.AbstractConfigurationProvider: Channel Memchannel connected to [HDFS]
16/12/24 11:48:35 INFO node.Application: Starting new configuration:{ sourceRunners:{} sinkRunners:{HDFS=SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor@1f8dbaa6 counterGroup:{ name:null counters:{} } }} channels:{Memchannel=org.apache.flume.channel.MemoryChannel{name: Memchannel}} }
16/12/24 11:48:35 INFO node.Application: Starting Channel Memchannel
16/12/24 11:48:36 INFO instrumentation.MonitoredCounterGroup: Monitored counter group for type: CHANNEL, name: Memchannel: Successfully registered new MBean.
16/12/24 11:48:36 INFO instrumentation.MonitoredCounterGroup: Component type: CHANNEL, name: Memchannel started
16/12/24 11:48:36 INFO node.Application: Starting Sink HDFS
16/12/24 11:48:36 INFO instrumentation.MonitoredCounterGroup: Monitored counter group for type: SINK, name: HDFS: Successfully registered new MBean.
16/12/24 11:48:36 INFO instrumentation.MonitoredCounterGroup: Component type: SINK, name: HDFS started

11 REPLIES

Re: Flume not able to get the data from twitter

@Praveen PentaReddy

From the log, it seems there is an issue with the channels: "Could not configure source Twitter due to: No Channels configured for Twitter".

I suggest revisiting your agent configuration file. Please try the links below:

http://vijayanyayapathi.com/apache-flume-load-data-from-twitter/

https://www.tutorialspoint.com/apache_flume/fetching_twitter_data.htm


Re: Flume not able to get the data from twitter

Rising Star

I have configured the Flume conf file exactly as described in the document, and when I run the Flume agent I get the log below.

Any recommendations?

:03:10 INFO node.PollingPropertiesFileConfigurationProvider: Configuration provider starting
16/12/25 12:03:10 INFO node.PollingPropertiesFileConfigurationProvider: Reloading configuration file:/etc/flume-ng/conf/flume.conf
16/12/25 12:03:10 INFO conf.FlumeConfiguration: Processing:HDFS
16/12/25 12:03:10 INFO conf.FlumeConfiguration: Processing:HDFS
16/12/25 12:03:10 INFO conf.FlumeConfiguration: Added sinks: HDFS Agent: TwitterAgent
16/12/25 12:03:10 INFO conf.FlumeConfiguration: Processing:HDFS
16/12/25 12:03:10 INFO conf.FlumeConfiguration: Processing:HDFS
16/12/25 12:03:10 INFO conf.FlumeConfiguration: Processing:HDFS
16/12/25 12:03:10 INFO conf.FlumeConfiguration: Processing:HDFS
16/12/25 12:03:10 INFO conf.FlumeConfiguration: Processing:HDFS
16/12/25 12:03:10 INFO conf.FlumeConfiguration: Processing:HDFS
16/12/25 12:03:10 WARN conf.FlumeConfiguration: Agent configuration for 'TwitterAgent' does not contain any valid channels. Marking it as invalid.
16/12/25 12:03:10 WARN conf.FlumeConfiguration: Agent configuration invalid for agent 'TwitterAgent'. It will be removed.
16/12/25 12:03:10 INFO conf.FlumeConfiguration: Post-validation flume configuration contains configuration for agents: []
16/12/25 12:03:10 WARN node.AbstractConfigurationProvider: No configuration found for this host:TwitterAgent
16/12/25 12:03:10 INFO node.Application: Starting new configuration:{ sourceRunners:{} sinkRunners:{} channels:{} }


Re: Flume not able to get the data from twitter

Rising Star

Any recommendations? Attached is my conf file: flumeconf.png


Re: Flume not able to get the data from twitter

Contributor

@Praveen PentaReddy

In your config, please change the property TwitterAgent.sinks.HDFS.channels to TwitterAgent.sinks.HDFS.channel.

Note that you have the plural "channels" where it should be the singular "channel".
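
To illustrate the naming rule, here is a minimal wiring sketch. The agent, source, channel, and sink names (TwitterAgent, Twitter, Memchannel, HDFS) are taken from the logs in this thread; the attached flumeconf.png is not reproduced here, so this is an assumption about the layout rather than the poster's actual file. In Flume, a source uses the plural "channels" key because it can write to several channels, while a sink uses the singular "channel" key because it drains exactly one.

```properties
# Declare the components that belong to the agent.
TwitterAgent.sources = Twitter
TwitterAgent.channels = Memchannel
TwitterAgent.sinks = HDFS

# Define the channel; its name must match the declaration above exactly (case-sensitive).
TwitterAgent.channels.Memchannel.type = memory

# Source side: plural "channels".
TwitterAgent.sources.Twitter.channels = Memchannel

# Sink side: singular "channel".
TwitterAgent.sinks.HDFS.type = hdfs
TwitterAgent.sinks.HDFS.channel = Memchannel
```

If the source's "channels" line is missing or misspelled, Flume reports "No Channels configured for Twitter", as in the first log in this thread.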

Re: Flume not able to get the data from twitter

Rising Star

After making the appropriate changes I am still getting the error below, even after downloading the required jar file flume-sources-1.0-SNAPSHOT.jar. I can see the TwitterSource class in this jar, so I am not sure why I am getting this error.

I have also attached the conf file (flumeconf.png).

16/12/26 11:25:49 ERROR lifecycle.LifecycleSupervisor: Unable to start EventDrivenSourceRunner: { source:com.cloudera.flume.source.TwitterSource{name:Twitter,state:IDLE} } - Exception follows.
java.lang.IllegalStateException: Authentication credentials are missing. See http://twitter4j.org/en/configuration.html for details
    at twitter4j.TwitterBaseImpl.ensureAuthorizationEnabled(TwitterBaseImpl.java:215)
    at twitter4j.TwitterStreamImpl.filter(TwitterStreamImpl.java:317)
    at com.cloudera.flume.source.TwitterSource.start(TwitterSource.java:150)
    at org.apache.flume.source.EventDrivenSourceRunner.start(EventDrivenSourceRunner.java:44)
    at org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:251)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)


Re: Flume not able to get the data from twitter

@Praveen PentaReddy

How did you install the Twitter library?

I see "com.cloudera.flume.source" in the error you posted. Are you using CDH or HDP?

Read this to learn how to install it on HDP:

https://community.hortonworks.com/content/supportkb/49166/conflicting-flume-documentation-regarding-...


Re: Flume not able to get the data from twitter

Rising Star

I am using the Cloudera distribution. I just copied the jar file into place and added it to the classpath.


Re: Flume not able to get the data from twitter

Contributor

@Praveen PentaReddy

Assuming you're running this with something like: flume-ng agent -n TwitterAgent -c conf -f flume.conf /usr/lib/flume-ng/conf/flume-conf.properties.

It appears your keys are incorrect. Double-check your tokens and retry; it looks like a typo or a similar problem with your Twitter credentials. I would also recommend using the real Flume jars if you can:

TwitterAgent.sources.Twitter.type = org.apache.flume.source.twitter.TwitterSource
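
For reference, the "Authentication credentials are missing" error above is raised by twitter4j when no OAuth credentials reach it. As a sketch, the four credential properties on the source would look like the following. The values are placeholders, not real keys, and the property names are the ones both the Cloudera and the Apache TwitterSource classes are understood to read, so verify them against the documentation for your Flume version.

```properties
# Placeholder values: replace each one with the key from your own Twitter application.
TwitterAgent.sources.Twitter.consumerKey = <your consumer key>
TwitterAgent.sources.Twitter.consumerSecret = <your consumer secret>
TwitterAgent.sources.Twitter.accessToken = <your access token>
TwitterAgent.sources.Twitter.accessTokenSecret = <your access token secret>
```

Property names are case-sensitive, so a misspelled key (for example a different capitalization) would likely be ignored rather than reported, leaving that credential empty.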

Re: Flume not able to get the data from twitter

Rising Star

If I specify the class below, do I need to download any jar files?

TwitterAgent.sources.Twitter.type = org.apache.flume.source.twitter.TwitterSource
