Expert Contributor
Posts: 73
Registered: ‎11-24-2017
Accepted Solution

Flume custom source

Hi everyone, I am trying to use a custom source in Flume, but when I start the agent I get the following exception:

org.apache.flume.FlumeException: Unable to load source type: com.cloudera.flume.source.MySource, class: com.cloudera.flume.source.MySource

 

The custom source is defined in the MySource.java file, in the package com.cloudera.flume.source.

This is what I did:

 

  1. Compile the Java file, passing the Flume and Hadoop libraries on the classpath (this generates the MySource.class file):
    javac -cp "/opt/cloudera/parcels/CDH-5.13.0-1.cdh5.13.0.p0.29/lib/hadoop/*:/opt/cloudera/parcels/CDH-5.13.0-1.cdh5.13.0.p0.29/lib/hadoop-mapreduce/*:/opt/cloudera/parcels/CDH-5.13.0-1.cdh5.13.0.p0.29/lib/flume-ng/lib/*" MySource.java -Xlint
  2. Create a manifest.mf file with the following content:
    Manifest-Version: 1.0
    Main-Class: com.cloudera.flume.source.MySource
  3. Generate the MySource.jar file:
    jar cvfm MySource.jar manifest.mf MySource.class
  4. Move the MySource.jar file into the Flume library folder:
    sudo mv MySource.jar /opt/cloudera/parcels/CDH-5.13.0-1.cdh5.13.0.p0.29/lib/flume-ng/lib
  5. The custom flume configuration file is the following:
    # custom.conf
    
    # Naming the components on the current agent. 
    MyAgent.sources = MySource 
    MyAgent.channels = MemChannel 
    MyAgent.sinks = HDFS
      
    # Describing/Configuring the source 
    MyAgent.sources.MySource.type = com.cloudera.flume.source.MySource
      
    # Describing/Configuring the sink 
    MyAgent.sinks.HDFS.type = hdfs 
    MyAgent.sinks.HDFS.hdfs.path = /test/flume/mysource-logs
    MyAgent.sinks.HDFS.hdfs.fileType = DataStream 
    MyAgent.sinks.HDFS.hdfs.writeFormat = Text 
    MyAgent.sinks.HDFS.hdfs.batchSize = 1000
    MyAgent.sinks.HDFS.hdfs.rollSize = 0 
    MyAgent.sinks.HDFS.hdfs.rollCount = 10000 
     
    # Describing/Configuring the channel 
    MyAgent.channels.MemChannel.type = memory 
    MyAgent.channels.MemChannel.capacity = 10000 
    MyAgent.channels.MemChannel.transactionCapacity = 100
      
    # Binding the source and sink to the channel 
    MyAgent.sources.MySource.channels = MemChannel
    MyAgent.sinks.HDFS.channel = MemChannel 
  6. Then start the agent with the following command:
    flume-ng agent \
    --conf /etc/flume-ng/conf \
    --conf-file custom.conf \
    --name MyAgent \
    -Dflume.root.logger=INFO,console

At this point I get an org.apache.flume.FlumeException; it seems Flume cannot find

com.cloudera.flume.source.MySource

 

In the library paths printed when the agent starts, I can see /opt/cloudera/parcels/CDH-5.13.0-1.cdh5.13.0.p0.29/lib/flume-ng/lib, which is where I copied the MySource.jar file, so I don't understand why it cannot find the class.

What am I doing wrong?

 

PS: I am using CDH 5.13 installed by Cloudera Manager.

 

New Contributor
Posts: 1
Registered: ‎05-21-2018

Re: Flume custom source

I suspect this has to do with the way you created the jar. When you extract the jar, is the class under the same package path as shown in the error message?

Also, try specifying the fully qualified class name in the Flume config.
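
One way to run that check without extracting the jar is to list its entries and look for the class at its package path. The following is only a sketch: the helper names class_to_entry and jar_has_class are made up for illustration, and it assumes the JDK jar tool is on the PATH.

```shell
# Hypothetical helpers (not part of Flume or the JDK).

# Convert a fully qualified class name into the entry path
# that the class file must have inside the jar.
class_to_entry() {
    printf '%s.class\n' "$(printf '%s' "$1" | tr '.' '/')"
}

# Succeed only if the jar lists the class at its package path.
# Assumes the JDK `jar` tool is available on PATH.
jar_has_class() {
    jar tf "$1" | grep -Fxq "$(class_to_entry "$2")"
}
```

For example, jar_has_class MySource.jar com.cloudera.flume.source.MySource && echo found || echo missing should print "missing" for a jar that only holds MySource.class at the top level.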

 

 

Posts: 1,760
Kudos: 378
Solutions: 282
Registered: ‎07-31-2013

Re: Flume custom source

@Smitha is right here. Specifically, the step below is incorrect:

> jar cvfm MySource.jar manifest.mf MySource.class

Your class is declared in a package (com.cloudera.flume.source), but this command places the class file at the top level of the jar, so it cannot be found under its package name. Instead, do this:

~> mkdir -p com/cloudera/flume/source/
~> mv MySource.class com/cloudera/flume/source/
~> jar cvf MySource.jar com/cloudera/flume/source/MySource.class

Doing the above steps within your sequence ensures the class is placed in its declared package inside the jar instead of at the top level.
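
As an alternative to creating the directories by hand, javac can lay out the package directories itself when given an output directory with -d. The sketch below wraps that in a helper; the function name build_source_jar and its argument order are assumptions for illustration, and the classpath you pass in would be the same Flume and Hadoop library paths used in the original compile step.

```shell
# Hypothetical helper: compile one source file and package it so that
# class files sit under their package directories inside the jar.
# Usage: build_source_jar <source.java> <output.jar> <classpath>
build_source_jar() {
    src=$1
    jar_out=$2
    cp=$3
    out_dir=$(mktemp -d) || return 1
    # -d makes javac create com/cloudera/flume/source/... under out_dir
    javac -cp "$cp" -d "$out_dir" "$src" || return 1
    # -C changes into out_dir so entries are stored relative to it
    jar cf "$jar_out" -C "$out_dir" . || return 1
    rm -rf "$out_dir"
}
```

No Main-Class manifest entry is written here; that entry only matters when a jar is launched with java -jar, which is not how the agent loads a source class.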

More generally, you can avoid this kind of packaging mistake by using a build tool such as Maven, or an IDE such as IntelliJ or Eclipse, which can build archives from source projects. These tools package jars in the required form and preserve the package layout, among other benefits.
Expert Contributor
Posts: 73
Registered: ‎11-24-2017

Re: Flume custom source

Thanks, I did end up using Maven and the plugins.d folder in Flume. I forgot to update the topic; thank you all for the help!
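
For anyone landing here later, a rough sketch of the plugins.d layout mentioned above: the Flume user guide describes one directory per plugin, with lib for the plugin's own jar and libext for its dependencies. The helper name install_flume_plugin and the plugin name are made up for illustration, and the plugins.d location must be supplied by you (the user guide gives $FLUME_HOME/plugins.d as the default; the path on a CDH parcel install is not shown in this thread).

```shell
# Hypothetical helper: copy a plugin jar into a plugins.d-style layout.
# Usage: install_flume_plugin <plugins.d dir> <plugin name> <jar file>
install_flume_plugin() {
    plugins_dir=$1
    plugin_name=$2
    jar_file=$3
    # lib holds the plugin jar; libext is for the plugin's dependency jars
    mkdir -p "$plugins_dir/$plugin_name/lib" "$plugins_dir/$plugin_name/libext" || return 1
    cp "$jar_file" "$plugins_dir/$plugin_name/lib/"
}
```

For example, install_flume_plugin "$FLUME_HOME/plugins.d" my-source MySource.jar would place the jar at plugins.d/my-source/lib/MySource.jar, assuming FLUME_HOME is set in your environment.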
