
What is the issue with flume agent configuration?


Explorer

I have configured a Flume agent for Twitter, but I am facing issues. The agent gave me the following error. I have tried both the Cloudera source type and the Hortonworks source type in the agent configuration.

31 Dec 2016 16:11:28,444 ERROR [conf-file-poller-0] (org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run:142)  - Failed to load configuration data. Exception follows.
org.apache.flume.FlumeException: Unable to load source type: poc.hortonworks.flume.source.twitter.TwitterSource, class: poc.hortonworks.flume.source.twitter.TwitterSource
# Twitter streaming#####################################
TwitterAgent.sources = Twitter
TwitterAgent.channels = MemChannel
TwitterAgent.sinks = HDFS
# I tried both source types and got the same error with each.
TwitterAgent.sources.Twitter.type = poc.hortonworks.flume.source.twitter.TwitterSource
#com.cloudera.flume.source.TwitterSource
TwitterAgent.sources.Twitter.channels = MemChannel 
TwitterAgent.sources.Twitter.consumerKey = lojYSyG00i19PKUfJhHW9zbOu
TwitterAgent.sources.Twitter.consumerSecret = yo2hIAavEPPRbmUv9ec30P1UFrl9dQxXDrYhH97AU6GmIHcnGf
TwitterAgent.sources.Twitter.accessToken = 293638770-sGR8FWTJfIAvLo8y3VjuiFyHdZLJlD1dtytHx2
TwitterAgent.sources.Twitter.accessTokenSecret = 39SlNIVR3NMOlNSqOKRHhVo7zD6rWvJf2zIJtrMN2P5ET
TwitterAgent.sources.Twitter.keywords = finland,helsinki


TwitterAgent.sinks.HDFS.channel = MemChannel
TwitterAgent.sinks.HDFS.type = hdfs
TwitterAgent.sinks.HDFS.hdfs.path =hdfs://hdp.localdomain:8020/data/input
TwitterAgent.sinks.HDFS.hdfs.fileType = DataStream
TwitterAgent.sinks.HDFS.hdfs.writeFormat = Text
TwitterAgent.sinks.HDFS.hdfs.batchSize = 10
TwitterAgent.sinks.HDFS.hdfs.rollSize = 0
TwitterAgent.sinks.HDFS.hdfs.rollCount = 10000


TwitterAgent.channels.MemChannel.type = memory
TwitterAgent.channels.MemChannel.capacity = 10000
TwitterAgent.channels.MemChannel.transactionCapacity = 100

I have also added the jar file flume-sources-1.0-SNAPSHOT.jar, which is present in the folder, to FLUME_CLASSPATH:

export FLUME_CLASSPATH=$FLUME_CLASSPATH:/usr/lib/flume/lib/flume-sources-1.0-SNAPSHOT.jar
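Since the error is "Unable to load source type", one way to narrow it down is to check whether the jar on the classpath actually contains the class named in the `type` property. The following is only a minimal sketch: it assumes `unzip` is installed, and `class_to_path` and `jar_has_class` are hypothetical helper names, not part of Flume.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Flume) for checking a jar's contents.

# Convert a fully qualified Java class name into its path inside a jar,
# e.g. a.b.C becomes a/b/C.class
class_to_path() {
    printf '%s\n' "$1" | tr '.' '/' | sed 's/$/.class/'
}

# Exit status 0 if jar $1 lists class $2, non-zero otherwise
# (also non-zero if the jar is missing or unreadable).
jar_has_class() {
    unzip -l "$1" 2>/dev/null | grep -qF "$(class_to_path "$2")"
}

# Example call, using the jar path and source type from this post:
# jar_has_class /usr/lib/flume/lib/flume-sources-1.0-SNAPSHOT.jar \
#     poc.hortonworks.flume.source.twitter.TwitterSource \
#     && echo "class found" || echo "class missing"
```

If the check reports the class as missing, the `type` value and the jar do not match, and no classpath setting will fix that.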

One more issue: in the flume-env.sh file of the TwitterAgent, I set the path to something like

export FLUME_CLASSPATH=$FLUME_CLASSPATH:/usr/hdp/2.3.6.0-3796/flume/lib/flume-sources-1.0-SNAPSHOT.jar

However, the path gets deleted once I refresh the agent. I don't know why that happens. In the env file there is one more entry like

if [ -e "/usr/lib/flume/lib/ambari-metrics-flume-sink.jar" ]; then
  export FLUME_CLASSPATH=$FLUME_CLASSPATH:/usr/lib/flume/lib/ambari-metrics-flume-sink.jar
fi

I even changed this path (/usr/lib/flume/lib) to something like (/usr/hdp/2.3.6.0-3796/flume/lib/), but to no avail. The path reverts to the previous one when I refresh the agent.

Now I suspect it is a path issue.
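The reverting path would be consistent with Ambari regenerating flume-env.sh from its own stored template each time the agent is refreshed, although nothing in this thread confirms that. If that is what happens here, the classpath line would have to be added to the flume-env content in Ambari rather than to the file on disk. A hedged sketch of such an addition, modeled on the existing ambari-metrics block; `add_to_flume_classpath` is a hypothetical helper name:

```shell
# Hypothetical addition to the flume-env template (assumed to be managed in Ambari).

# Append a jar to FLUME_CLASSPATH only if the file exists on this host.
add_to_flume_classpath() {
  if [ -e "$1" ]; then
    # Avoid a leading ':' when FLUME_CLASSPATH is still empty.
    export FLUME_CLASSPATH="${FLUME_CLASSPATH:+$FLUME_CLASSPATH:}$1"
  fi
}

# Jar path taken from the post above.
add_to_flume_classpath "/usr/hdp/2.3.6.0-3796/flume/lib/flume-sources-1.0-SNAPSHOT.jar"
```

Guarding on the file's existence mirrors the ambari-metrics block, so an agent host without the jar is left unchanged.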

3 REPLIES

Re: What is the issue with flume agent configuration?

Expert Contributor

It looks like the agent is unable to load the config file. Make sure the configuration file is being picked up correctly and has the proper permissions. Your post does not include those details, but a sample agent startup would look like this:

$ bin/flume-ng agent --conf ./conf/ -f conf/twitter.conf -Dflume.root.logger=DEBUG,console -n TwitterAgent

Re: What is the issue with flume agent configuration?

Explorer

@Sumesh The configuration file is managed through Ambari, so there shouldn't be any issue with the permissions. Here is the list of all the agents. Just to rule that out, I granted 777 permissions on all the files of the TwitterAgent agent.

drwxr-xr-x 9 flume root 4096 Dec 30 16:35 ./
drwxr-xr-x 3 root  root 4096 Nov 23 16:57 ../
drwxr-xr-x 2 flume root 4096 Dec 31 16:49 a1/
drwxr-xr-x 2 flume root 4096 Dec 30 15:27 a11/
drwxr-xr-x 2 flume root 4096 Dec 27 15:05 ant/
drwxr-xr-x 2 root  root 4096 Nov 23 16:57 conf.empty/
drwxr-xr-x 2 flume root 4096 Dec 31 16:49 FMIagent/
drwxr-xr-x 2 flume root 4096 Dec 31 16:49 test/
drwxr-xr-x 2 flume root 4096 Dec 31 16:49 TwitterAgent/				

You can see they all have the same permissions. Next I set 777 on all the files in the TwitterAgent directory:

drwxrwxrwx 2 flume root 4096 Dec 31 16:49 ./
drwxr-xr-x 9 flume root 4096 Dec 30 16:35 ../
-rwxrwxrwx 1 flume root   59 Dec 31 16:49 ambari-meta.json*
-rwxrwxrwx 1 flume root 1238 Dec 31 16:49 flume.conf*
-rwxrwxrwx 1 flume root 1475 Dec 31 16:49 flume-env.sh*
-rwxrwxrwx 1 flume root  237 Dec 30 16:35 flume-metrics2.properties*
-rwxrwxrwx 1 flume root 3083 Dec 30 16:35 log4j.properties*

Then I refreshed the configuration in Ambari and checked whether I could get any tweets. However, I hit the same issue. I don't know what could be wrong with the configuration.

02 Jan 2017 09:52:35,473 ERROR [conf-file-poller-0] (org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run:142)  - Failed to load configuration data. Exception follows.
org.apache.flume.FlumeException: Unable to load source type: poc.hortonworks.flume.source.twitter.TwitterSource, class: poc.hortonworks.flume.source.twitter.TwitterSource

Re: What is the issue with flume agent configuration?

Explorer

So I finally resolved the issue. The problem was with the .jar file and the source type. I downloaded the jar file from http://blog.hubacek.uk/streaming-tweets-into-hadoop-part-ii/. This time I did not worry much about the flume_classpath; I put the jar in place and, instead of the HDP source, used the Cloudera source in the agent config:

TwitterAgent.sources.Twitter.type = com.cloudera.flume.source.TwitterSource

Now it works well enough to get data into HDFS.
