Flume not able to get data from Twitter

Rising Star

I have set up all the required configuration, including the registered entries. However, when I run the Flume agent I get the logs below, and I cannot see any data in HDFS.

My configuration file is attached (flumeconf.png).

16/12/24 11:48:34 INFO conf.FlumeConfiguration: Processing:HDFS
16/12/24 11:48:34 INFO conf.FlumeConfiguration: Processing:HDFS
16/12/24 11:48:35 WARN conf.FlumeConfiguration: Could not configure source Twitter due to: No Channels configured for Twitter
org.apache.flume.conf.ConfigurationException: No Channels configured for Twitter
    at org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.validateSources(FlumeConfiguration.java:574)
    at org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.isValid(FlumeConfiguration.java:346)
    at org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.access$000(FlumeConfiguration.java:213)
    at org.apache.flume.conf.FlumeConfiguration.validateConfiguration(FlumeConfiguration.java:127)
    at org.apache.flume.conf.FlumeConfiguration.<init>(FlumeConfiguration.java:109)
    at org.apache.flume.node.PropertiesFileConfigurationProvider.getFlumeConfiguration(PropertiesFileConfigurationProvider.java:189)
    at org.apache.flume.node.AbstractConfigurationProvider.getConfiguration(AbstractConfigurationProvider.java:89)
    at org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:140)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
16/12/24 11:48:35 INFO conf.FlumeConfiguration: Post-validation flume configuration contains configuration for agents: [TwitterAgent]
16/12/24 11:48:35 INFO node.AbstractConfigurationProvider: Creating channels
16/12/24 11:48:35 INFO channel.DefaultChannelFactory: Creating instance of channel Memchannel type memory
16/12/24 11:48:35 INFO node.AbstractConfigurationProvider: Created channel Memchannel
16/12/24 11:48:35 INFO sink.DefaultSinkFactory: Creating instance of sink: HDFS, type: hdfs
16/12/24 11:48:35 INFO node.AbstractConfigurationProvider: Channel Memchannel connected to [HDFS]
16/12/24 11:48:35 INFO node.Application: Starting new configuration:{ sourceRunners:{} sinkRunners:{HDFS=SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor@1f8dbaa6 counterGroup:{ name:null counters:{} } }} channels:{Memchannel=org.apache.flume.channel.MemoryChannel{name: Memchannel}} }
16/12/24 11:48:35 INFO node.Application: Starting Channel Memchannel
16/12/24 11:48:36 INFO instrumentation.MonitoredCounterGroup: Monitored counter group for type: CHANNEL, name: Memchannel: Successfully registered new MBean.
16/12/24 11:48:36 INFO instrumentation.MonitoredCounterGroup: Component type: CHANNEL, name: Memchannel started
16/12/24 11:48:36 INFO node.Application: Starting Sink HDFS
16/12/24 11:48:36 INFO instrumentation.MonitoredCounterGroup: Monitored counter group for type: SINK, name: HDFS: Successfully registered new MBean.
16/12/24 11:48:36 INFO instrumentation.MonitoredCounterGroup: Component type: SINK, name: HDFS started
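
For anyone reading this log: the warning "No Channels configured for Twitter", together with the empty sourceRunners:{} in the "Starting new configuration" line, indicates that the agent started without the Twitter source, so nothing can reach HDFS. The real configuration is only attached as an image, so what follows is a minimal sketch and not the poster's file. It reuses the component names that appear in the log (TwitterAgent, Twitter, Memchannel, HDFS) and assumes one of the usual causes: the source binding written with the singular key channel instead of channels, or a channel name whose spelling or case differs from the declared one.

```properties
# Component names are taken from the log; everything else is a sketch.
TwitterAgent.sources = Twitter
TwitterAgent.channels = Memchannel
TwitterAgent.sinks = HDFS

TwitterAgent.channels.Memchannel.type = memory

# A source uses the plural key "channels".
# The name must match the declared channel exactly, including case.
TwitterAgent.sources.Twitter.channels = Memchannel

# A sink uses the singular key "channel".
TwitterAgent.sinks.HDFS.channel = Memchannel
```

The asymmetry is deliberate in Flume: a source may fan out to several channels, while a sink drains exactly one.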

11 REPLIES

Contributor

You shouldn't have to; the tutorial posted by @Sagar Shimpi above should walk you through it.

Rising Star

I have provided the required details in the configuration file; however, I get the error below after changing the package to org.apache.flume.source.twitter.TwitterSource.

16/12/26 16:08:10 ERROR node.PollingPropertiesFileConfigurationProvider: Failed to load configuration data. Exception follows.
org.apache.flume.FlumeException: Unable to load source type: org.apache.source.flume.TwitterSource, class: org.apache.source.flume.TwitterSource
    at org.apache.flume.source.DefaultSourceFactory.getClass(DefaultSourceFactory.java:69)
    at org.apache.flume.source.DefaultSourceFactory.create(DefaultSourceFactory.java:42)
    at org.apache.flume.node.AbstractConfigurationProvider.loadSources(AbstractConfigurationProvider.java:322)
    at org.apache.flume.node.AbstractConfigurationProvider.getConfiguration(AbstractConfigurationProvider.java:97)
    at org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:140)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassNotFoundException: org.apache.source.flume.TwitterSource
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:190)
    at org.apache.flume.source.DefaultSourceFactory.getClass(DefaultSourceFactory.java:67)
    ... 11 more
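
One detail worth noting in this trace: the class Flume tried to load is org.apache.source.flume.TwitterSource, which is not the name given in the post (org.apache.flume.source.twitter.TwitterSource). The package segments are in a different order, so the ClassNotFoundException may simply come from a mistyped type value. A sketch of the line in question, assuming the Flume build in use ships the Apache Twitter source on its classpath (this depends on the Flume version and distribution):

```properties
# Fully qualified class name of the Apache Flume Twitter source.
# Assumption: the jar containing this class is on the agent's classpath.
TwitterAgent.sources.Twitter.type = org.apache.flume.source.twitter.TwitterSource
```

If the class is still not found with the corrected name, the jar is probably missing from the Flume lib directory or plugin path.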

When I change the package to the Cloudera one, com.cloudera.flume.source.TwitterSource, I get the error below, and I am not sure what is causing it.

Can anyone help me? I can get online and share my screen.

:Twitter,state:IDLE} } - Exception follows.
java.lang.IllegalStateException: Authentication credentials are missing. See http://twitter4j.org/en/configuration.html for details
    at twitter4j.TwitterBaseImpl.ensureAuthorizationEnabled(TwitterBaseImpl.java:215)
    at twitter4j.TwitterStreamImpl.filter(TwitterStreamImpl.java:317)
    at com.cloudera.flume.source.TwitterSource.start(TwitterSource.java:150)
    at org.apache.flume.source.EventDrivenSourceRunner.start(EventDrivenSourceRunner.java:44)
    at org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:251)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
16/12/26 16:16:59 ERROR lifecycle.LifecycleSupervisor: Unable to start EventDrivenSourceRunner: { source:com.cloudera.flume.source.TwitterSource{name:Twitter,state:IDLE} } - Exception follows.
java.lang.IllegalStateException: Authentication credentials are missing. See http://twitter4j.org/en/configuration.html for details
    at twitter4j.TwitterBaseImpl.ensureAuthorizationEnabled(TwitterBaseImpl.java:215)
    at twitter4j.TwitterStreamImpl.filter(TwitterStreamImpl.java:317)
    at com.cloudera.flume.source.TwitterSource.start(TwitterSource.java:150)
    at org.apache.flume.source.EventDrivenSourceRunner.start(EventDrivenSourceRunner.java:44)
    at org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:251)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
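
This exception is raised by twitter4j when it is asked to open a stream without OAuth credentials, which suggests the four credential properties never reached the source (for example because a key is misspelled or differs in case). Below is a sketch of the source section, assuming the property names used in Cloudera's Twitter example for com.cloudera.flume.source.TwitterSource. The values in angle brackets are placeholders, and the keyword list is purely illustrative.

```properties
# Sketch only: replace each <...> placeholder with the value from your
# own registered Twitter application. Key names are case sensitive.
TwitterAgent.sources.Twitter.type = com.cloudera.flume.source.TwitterSource
TwitterAgent.sources.Twitter.channels = Memchannel
TwitterAgent.sources.Twitter.consumerKey = <consumer key>
TwitterAgent.sources.Twitter.consumerSecret = <consumer secret>
TwitterAgent.sources.Twitter.accessToken = <access token>
TwitterAgent.sources.Twitter.accessTokenSecret = <access token secret>
# Illustrative keywords, not taken from the original post.
TwitterAgent.sources.Twitter.keywords = hadoop, flume
```

Each property must carry the same agent and source prefix (TwitterAgent.sources.Twitter here) as the rest of the file; a credential set under a different source name is silently ignored by this source.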