Flume Twitter - Could not find any valid channels. Marking it as invalid. No configuration found for this host:TwitterAgent

Expert Contributor

Hi, I am trying to fetch data from Twitter into HDFS, and when I run my flume-ng agent I get the log output below.

Could you please assist me?

16/12/24 00:06:22 INFO node.PollingPropertiesFileConfigurationProvider: Configuration provider starting
16/12/24 00:06:22 INFO node.PollingPropertiesFileConfigurationProvider: Reloading configuration file:/etc/flume-ng/conf/flume.conf
16/12/24 00:06:22 WARN conf.FlumeConfiguration: Invalid property specified: sink.HDFS.hdfs.path
16/12/24 00:06:22 WARN conf.FlumeConfiguration: Configuration property ignored: TwitterAgent.sink.HDFS.hdfs.path = hdfs://master:8020/user/cloudera/tweets/
16/12/24 00:06:22 WARN conf.FlumeConfiguration: Invalid property specified: channel.Memchannel.capacity
16/12/24 00:06:22 WARN conf.FlumeConfiguration: Configuration property ignored: TwitterAgent.channel.Memchannel.capacity = 10000
16/12/24 00:06:22 WARN conf.FlumeConfiguration: Invalid property specified: sink.HDFS.channel
16/12/24 00:06:22 WARN conf.FlumeConfiguration: Configuration property ignored: TwitterAgent.sink.HDFS.channel = Memchannel
16/12/24 00:06:22 WARN conf.FlumeConfiguration: Invalid property specified: sink.HDFS.hdfs.writeFormat
16/12/24 00:06:22 WARN conf.FlumeConfiguration: Configuration property ignored: TwitterAgent.sink.HDFS.hdfs.writeFormat = Text
16/12/24 00:06:22 WARN conf.FlumeConfiguration: Invalid property specified: channel.Memchannel.type
16/12/24 00:06:22 WARN conf.FlumeConfiguration: Configuration property ignored: TwitterAgent.channel.Memchannel.type = memory
16/12/24 00:06:22 WARN conf.FlumeConfiguration: Invalid property specified: sink.HDFS.hdfs.rollCount
16/12/24 00:06:22 WARN conf.FlumeConfiguration: Configuration property ignored: TwitterAgent.sink.HDFS.hdfs.rollCount = 10000
16/12/24 00:06:22 WARN conf.FlumeConfiguration: Invalid property specified: channel.Memchannel.transactionalCapacity
16/12/24 00:06:22 WARN conf.FlumeConfiguration: Configuration property ignored: TwitterAgent.channel.Memchannel.transactionalCapacity = 100
16/12/24 00:06:22 INFO conf.FlumeConfiguration: Added sinks: HDFS Agent: TwitterAgent
16/12/24 00:06:22 WARN conf.FlumeConfiguration: Invalid property specified: sink.HDFS.hdfs.batchsize
16/12/24 00:06:22 WARN conf.FlumeConfiguration: Configuration property ignored: TwitterAgent.sink.HDFS.hdfs.batchsize = 1000
16/12/24 00:06:22 WARN conf.FlumeConfiguration: Invalid property specified: sink.HDFS.hdfs.rollSize
16/12/24 00:06:22 WARN conf.FlumeConfiguration: Configuration property ignored: TwitterAgent.sink.HDFS.hdfs.rollSize = 0
16/12/24 00:06:22 WARN conf.FlumeConfiguration: Invalid property specified: sink.HDFS.hdfs.filetype
16/12/24 00:06:22 WARN conf.FlumeConfiguration: Configuration property ignored: TwitterAgent.sink.HDFS.hdfs.filetype = DataStream
16/12/24 00:06:22 WARN conf.FlumeConfiguration: Invalid property specified: sink.HDFS.type
16/12/24 00:06:22 WARN conf.FlumeConfiguration: Configuration property ignored: TwitterAgent.sink.HDFS.type = hdfs
16/12/24 00:06:22 WARN conf.FlumeConfiguration: Agent configuration for 'TwitterAgent' does not contain any valid channels. Marking it as invalid.
16/12/24 00:06:22 WARN conf.FlumeConfiguration: Agent configuration invalid for agent 'TwitterAgent'. It will be removed.
16/12/24 00:06:22 INFO conf.FlumeConfiguration: Post-validation flume configuration contains configuration for agents: []
16/12/24 00:06:22 WARN node.AbstractConfigurationProvider: No configuration found for this host:TwitterAgent
16/12/24 00:06:22 INFO node.Application: Starting new configuration:{ sourceRunners:{} sinkRunners:{} channels:{} }

1 ACCEPTED SOLUTION

Rising Star

@Praveen PentaReddy Most likely in your twitter.conf you have "TwitterAgent.sink.HDFS.channel" instead of "TwitterAgent.sinks.HDFS.channel". There will be multiple such properties, and they should all start with "TwitterAgent.sinks", not "TwitterAgent.sink".

If that doesn't work, you may want to post your twitter.conf and flume.env.sh so we can help narrow it down.

I would also strongly suggest looking at NiFi for moving data from Twitter to HDFS. Here is a detailed tutorial on doing that: https://community.hortonworks.com/articles/1282/sample-hdfnifi-flow-to-push-tweets-into-solrbanana.h...


7 REPLIES

Explorer

As Devin said, your flume.conf file must contain the following incorrectly specified properties:

TwitterAgent.sink.HDFS.hdfs.path

TwitterAgent.channel.Memchannel.capacity

TwitterAgent.sink.HDFS.channel

TwitterAgent.sink.HDFS.hdfs.writeFormat

TwitterAgent.channel.Memchannel.type

TwitterAgent.sink.HDFS.hdfs.rollCount

TwitterAgent.channel.Memchannel.transactionalCapacity

TwitterAgent.sink.HDFS.hdfs.batchsize

TwitterAgent.sink.HDFS.hdfs.rollSize

TwitterAgent.sink.HDFS.hdfs.filetype

TwitterAgent.sink.HDFS.type

In each of the above properties you need to use TwitterAgent.sinks.HDFS... and TwitterAgent.channels.Memchannel... instead. Because the property syntax is wrong, Flume ignores these properties, so no channel ends up configured for the TwitterAgent agent. The agent is therefore marked as invalid and removed, which is why Flume cannot fetch any data.
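
As a rough sketch of what the corrected keys could look like, the fragment below rewrites the ignored properties with the plural "sinks" and "channels" prefixes. The values are copied from the "Configuration property ignored" log lines; the original conf file was never posted, so everything else about it is an assumption. The spellings transactionCapacity, hdfs.fileType and hdfs.batchSize follow the Flume User Guide rather than the logged names (transactionalCapacity, hdfs.filetype, hdfs.batchsize), so it is worth checking them against the documentation for your Flume version.

```properties
# Sketch only: values taken from the warnings in the first log.

# Channel properties use the plural "channels" prefix.
TwitterAgent.channels.Memchannel.type = memory
TwitterAgent.channels.Memchannel.capacity = 10000
TwitterAgent.channels.Memchannel.transactionCapacity = 100

# Sink properties use the plural "sinks" prefix.
TwitterAgent.sinks.HDFS.type = hdfs
TwitterAgent.sinks.HDFS.channel = Memchannel
TwitterAgent.sinks.HDFS.hdfs.path = hdfs://master:8020/user/cloudera/tweets/
TwitterAgent.sinks.HDFS.hdfs.fileType = DataStream
TwitterAgent.sinks.HDFS.hdfs.writeFormat = Text
TwitterAgent.sinks.HDFS.hdfs.batchSize = 1000
TwitterAgent.sinks.HDFS.hdfs.rollSize = 0
TwitterAgent.sinks.HDFS.hdfs.rollCount = 10000
```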

Expert Contributor

Thanks. After correcting the conf file, this is what I got. However, I still cannot see any data in HDFS.

16/12/24 11:48:34 INFO conf.FlumeConfiguration: Processing:HDFS
16/12/24 11:48:34 INFO conf.FlumeConfiguration: Processing:HDFS
16/12/24 11:48:35 WARN conf.FlumeConfiguration: Could not configure source Twitter due to: No Channels configured for Twitter
org.apache.flume.conf.ConfigurationException: No Channels configured for Twitter
    at org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.validateSources(FlumeConfiguration.java:574)
    at org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.isValid(FlumeConfiguration.java:346)
    at org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.access$000(FlumeConfiguration.java:213)
    at org.apache.flume.conf.FlumeConfiguration.validateConfiguration(FlumeConfiguration.java:127)
    at org.apache.flume.conf.FlumeConfiguration.<init>(FlumeConfiguration.java:109)
    at org.apache.flume.node.PropertiesFileConfigurationProvider.getFlumeConfiguration(PropertiesFileConfigurationProvider.java:189)
    at org.apache.flume.node.AbstractConfigurationProvider.getConfiguration(AbstractConfigurationProvider.java:89)
    at org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:140)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
16/12/24 11:48:35 INFO conf.FlumeConfiguration: Post-validation flume configuration contains configuration for agents: [TwitterAgent]
16/12/24 11:48:35 INFO node.AbstractConfigurationProvider: Creating channels
16/12/24 11:48:35 INFO channel.DefaultChannelFactory: Creating instance of channel Memchannel type memory
16/12/24 11:48:35 INFO node.AbstractConfigurationProvider: Created channel Memchannel
16/12/24 11:48:35 INFO sink.DefaultSinkFactory: Creating instance of sink: HDFS, type: hdfs
16/12/24 11:48:35 INFO node.AbstractConfigurationProvider: Channel Memchannel connected to [HDFS]
16/12/24 11:48:35 INFO node.Application: Starting new configuration:{ sourceRunners:{} sinkRunners:{HDFS=SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor@1f8dbaa6 counterGroup:{ name:null counters:{} } }} channels:{Memchannel=org.apache.flume.channel.MemoryChannel{name: Memchannel}} }
16/12/24 11:48:35 INFO node.Application: Starting Channel Memchannel
16/12/24 11:48:36 INFO instrumentation.MonitoredCounterGroup: Monitored counter group for type: CHANNEL, name: Memchannel: Successfully registered new MBean.
16/12/24 11:48:36 INFO instrumentation.MonitoredCounterGroup: Component type: CHANNEL, name: Memchannel started
16/12/24 11:48:36 INFO node.Application: Starting Sink HDFS
16/12/24 11:48:36 INFO instrumentation.MonitoredCounterGroup: Monitored counter group for type: SINK, name: HDFS: Successfully registered new MBean.
16/12/24 11:48:36 INFO instrumentation.MonitoredCounterGroup: Component type: SINK, name: HDFS started

Expert Contributor

Any suggestions?

Rising Star

@Praveen PentaReddy Can you please mark an answer as accepted and then create a new question for your new problem? That way it is indexed, and other people with a similar problem will be able to benefit. Please post your Flume conf file as well; I believe you might have a bind property off in it, which is why the sink is not working.

Thanks.
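
For reference, the "No Channels configured for Twitter" warning in the second log, together with the empty sourceRunners:{} in the started configuration, suggests the source-to-channel binding is the piece that is missing or misnamed. A minimal sketch of the bindings, assuming the component names that appear in the logs (TwitterAgent, Twitter, Memchannel, HDFS) and without knowing what the actual conf file contains:

```properties
# Sketch only: names are taken from the log output, not from the real conf file.
TwitterAgent.sources = Twitter
TwitterAgent.channels = Memchannel
TwitterAgent.sinks = HDFS

# A source can write to several channels, so its binding key is plural: "channels".
TwitterAgent.sources.Twitter.channels = Memchannel

# A sink reads from exactly one channel, so its binding key is singular: "channel".
TwitterAgent.sinks.HDFS.channel = Memchannel
```

The channel name in both bindings has to match the name declared on the TwitterAgent.channels line exactly, including case (Memchannel is not the same as MemChannel).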


@Praveen PentaReddy

agent.sources = Twitter
agent.channels = MemChannel
agent.sinks = HDFS

agent.sources.Twitter.type = com.orienit.kalyan.flume.source.KalyanTwitterSource
agent.sources.Twitter.channels = MemChannel
agent.sources.Twitter.consumerKey = xxxx
agent.sources.Twitter.consumerSecret = xxx
agent.sources.Twitter.accessToken = xxxx
agent.sources.Twitter.accessTokenSecret = xxxx
agent.sources.Twitter.keywords = hadoop,spark,kafka,flume,spark steaming,NIFI,Bigdata,hortonworks,oozie,sqoop,hive,mapreduce,pig,scala

agent.sinks.HDFS.type = hdfs
agent.sinks.HDFS.channel = MemChannel
agent.sinks.HDFS.hdfs.path = /flume/tweets/%y/%m/%d/%H/%M
agent.sinks.HDFS.hdfs.fileType = DataStream
agent.sinks.HDFS.hdfs.writeFormat = Text
agent.sinks.HDFS.hdfs.batchSize = 100
agent.sinks.HDFS.hdfs.rollSize = 0
agent.sinks.HDFS.hdfs.rollCount = 100
agent.sinks.HDFS.hdfs.useLocalTimeStamp = true

agent.channels.MemChannel.type = memory
agent.channels.MemChannel.capacity = 1000
agent.channels.MemChannel.transactionCapacity = 100

You can try this configuration, Praveen.

Expert Contributor

What is the difference between the configuration I posted and the one you gave me?