Dr
Explorer
Posts: 17
Registered: ‎08-14-2013

Flume avro event serializer not producing avro file.

Hi Guys,

I've built the avro serializer

https://github.com/cloudera/cdk/blob/master/cdk-flume-avro-event-serializer/src/main/java/org/apache...

and installed it in Flume's plugins directory. I've updated my agent config so that the serializer points to AvroEventSerializer$Builder.

When I send my events, I set the schema in the header (a literal string for now) and the body is JSON. The events reach HDFS, but the body is written as plain text and no errors are logged. I was expecting an Avro file.

Am I doing anything wrong?

Do you have an example agent config?

 

Thanks

 

Andrew

Expert Contributor
Posts: 63
Registered: ‎08-06-2013

Re: Flume avro event serializer not producing avro file.

Does setting sink serializer to avro_event generate Json?

 

agent.sinks.svc_0_sink.type = hdfs
agent.sinks.svc_0_sink.hdfs.fileType = DataStream
agent.sinks.svc_0_sink.serializer = avro_event
agent.sinks.svc_0_sink.serializer.compressionCodec = snappy

Dr
Explorer
Posts: 17
Registered: ‎08-14-2013

Re: Flume avro event serializer not producing avro file.

I've moved cdk-flume-avro-event-serializer-0.5.1-SNAPSHOT.jar into the lib directory of flume-ng. Now I get this error:

 

org.apache.flume.FlumeException: Unable to instantiate Builder from org.apache.flume.serialization.AvroEventSerializer: does not appear to implement org.apache.flume.serialization.EventSerializer$Builder

 

My agent config is:

 

## Write to HDFS
collector.sinks.HadoopOut.type = hdfs
collector.sinks.HadoopOut.hdfs.path = /flume/
collector.sinks.HadoopOut.hdfs.fileType = DataStream
collector.sinks.HadoopOut.hdfs.rollSize = 0
collector.sinks.HadoopOut.hdfs.rollCount = 5000
collector.sinks.HadoopOut.hdfs.rollInterval = 5
collector.sinks.HadoopOut.hdfs.batchSize = 1000
collector.sinks.HadoopOut.serializer = org.apache.flume.serialization.AvroEventSerializer
collector.sinks.HadoopOut.serializer.compressionCodec = snappy

Dr
Explorer
Posts: 17
Registered: ‎08-14-2013

Re: Flume avro event serializer not producing avro file.

I've fixed it. I forgot the $Builder in the agent config.

 

Works like a charm.
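
For anyone hitting the same instantiation error: the exception text and the agent config above suggest the corrected serializer line would look roughly like this, assuming the class name from that config and leaving every other property unchanged.

```
collector.sinks.HadoopOut.serializer = org.apache.flume.serialization.AvroEventSerializer$Builder
```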

 

Thanks

 

Andrew

Expert Contributor
Posts: 63
Registered: ‎08-06-2013

Re: Flume avro event serializer not producing avro file.

Correction:
Does setting sink serializer to avro_event generate Avro?
Dr
Explorer
Posts: 17
Registered: ‎08-14-2013

Re: Flume avro event serializer not producing avro file.

Yes, but I'm using this class so that my client apps can send custom schemas in the headers and have Flume serialize the JSON in the body.

If I set avro_event, the schema is the default one: the headers and body of the Flume event.

The class now creates an Avro file via Flume, but when I use avro-tools (tojson) or Hive to look at it, I get an IndexOutOfBounds error.

I assume that when using this class the body of the Flume event should be JSON?
Dr
Explorer
Posts: 17
Registered: ‎08-14-2013

Re: Flume avro event serializer not producing avro file.

Can anyone help with this? I think I'm just not setting the body of the Flume event correctly.

 

Getting this to work would mean a wider adoption of Hadoop in my company.

Dr
Explorer
Posts: 17
Registered: ‎08-14-2013

Re: Flume avro event serializer not producing avro file.

I spoke too soon!

 

The answer was in the test class and its comments: the body has to be the binary-encoded Avro datum, not JSON.

 

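import java.io.ByteArrayOutputStream;
import java.io.IOException;
import org.apache.avro.Schema;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.EncoderFactory;
import org.apache.avro.reflect.ReflectDatumWriter;
import org.apache.flume.Event;
import org.apache.flume.event.EventBuilder;

// The event body carries the Avro-encoded datum.
Event event = EventBuilder.withBody(serializeAvro(record, schema));

// Encodes a datum to Avro binary using the given schema.
private byte[] serializeAvro(Object datum, Schema schema) throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    ReflectDatumWriter<Object> writer = new ReflectDatumWriter<Object>(schema);
    BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
    out.reset();
    writer.write(datum, encoder);
    // Flush so buffered bytes reach the underlying stream before it is read.
    encoder.flush();
    return out.toByteArray();
}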
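
To make "Avro datum binary" concrete, here is a minimal stdlib-only sketch of what Avro's binary encoding produces for a tiny record. The class AvroBodySketch, its methods, and the two-field record (a string name and an int count) are hypothetical and exist only for illustration; real code should use the Avro library as in the snippet above. The sketch hand-implements the encoding rules from the Avro specification: ints and longs as zigzag varints, strings as a length followed by UTF-8 bytes, and records as their fields concatenated in schema order.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical illustration class; not part of Flume, Avro, or the CDK. */
class AvroBodySketch {

    // Avro encodes int and long values as zigzag varints.
    public static void writeLong(ByteArrayOutputStream out, long value) {
        long zigzag = (value << 1) ^ (value >> 63);
        while ((zigzag & ~0x7FL) != 0) {
            // Emit the low 7 bits with the continuation bit set.
            out.write((int) ((zigzag & 0x7F) | 0x80));
            zigzag >>>= 7;
        }
        out.write((int) zigzag);
    }

    // Avro encodes a string as its UTF-8 byte length (a long) followed by the bytes.
    public static void writeString(ByteArrayOutputStream out, String value) {
        byte[] utf8 = value.getBytes(StandardCharsets.UTF_8);
        writeLong(out, utf8.length);
        out.write(utf8, 0, utf8.length);
    }

    // A record is its fields in schema order, with no field names or delimiters.
    // Assumed schema for this sketch: {name: string, count: int}.
    public static byte[] encodeRecord(String name, int count) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeString(out, name);
        writeLong(out, count);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] avroBody = encodeRecord("a", 1);
        byte[] jsonBody = "{\"name\":\"a\",\"count\":1}".getBytes(StandardCharsets.UTF_8);
        System.out.println("avro body bytes: " + avroBody.length);
        System.out.println("json body bytes: " + jsonBody.length);
    }
}
```

The encoded record contains no field names or braces, so it looks nothing like the JSON text for the same data. A reader expecting Avro binary would interpret the first bytes of a JSON body as lengths or numbers, which may explain decoding failures like the one reported earlier, though the thread does not confirm the cause.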

New Contributor
Posts: 3
Registered: ‎12-10-2018

Re: Flume avro event serializer not producing avro file.

Dr, I might be facing the same issue, and unfortunately I can't see what the real problem was: when I try to read the file using the validateAvroFile method, it fails for me on the console too. Could you please point out the actual cause and guide me in the right direction?

 

Your help would mean a lot to me. Hoping for a response.

 

Thanks