Created 08-14-2013 01:32 PM
Hi Guys,
I've built the Avro serializer
https://github.com/cloudera/cdk/blob/master/cdk-flume-avro-event-serializer/src/main/java/org/apache...
and installed it in the plugins directory of Flume. I've updated my agent config with the serializer pointing to AvroEventSerializer$Builder.
When I send in my events I set the schema in a header (a literal string for now); the body is JSON. Everything goes through to HDFS with no errors, but the file is just plain text. I was expecting an Avro file.
Am I doing something wrong?
Do you have an example agent config?
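For concreteness, here is a minimal sketch of how the schema can travel with the event. The flume.avro.schema.literal / flume.avro.schema.url header names are what the CDK serializer appears to read; schemaJson and avroBody are placeholders for real values:

import java.util.HashMap;
import java.util.Map;
import org.apache.flume.Event;
import org.apache.flume.event.EventBuilder;

// Placeholder schema literal; substitute your real schema here.
String schemaJson =
    "{\"type\":\"record\",\"name\":\"Line\",\"fields\":[{\"name\":\"text\",\"type\":\"string\"}]}";

Map<String, String> headers = new HashMap<String, String>();
headers.put("flume.avro.schema.literal", schemaJson);
// or point at an .avsc file instead of inlining the schema:
// headers.put("flume.avro.schema.url", "hdfs:///schemas/event.avsc");

// avroBody is a placeholder for the binary-encoded datum (see the later posts).
Event event = EventBuilder.withBody(avroBody, headers);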
Thanks
Andrew
Created 08-14-2013 04:39 PM
Does setting the sink serializer to avro_event generate JSON?
agent.sinks.svc_0_sink.type = hdfs
agent.sinks.svc_0_sink.hdfs.fileType = DataStream
agent.sinks.svc_0_sink.serializer = avro_event
agent.sinks.svc_0_sink.serializer.compressionCodec = snappy
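For what it's worth: as I understand the built-in avro_event serializer, it wraps every event in Flume's generic event schema (a headers map plus a bytes body), so a JSON body is stored as opaque bytes rather than parsed into Avro fields. A quick way to check what actually landed is to copy a rolled file out of HDFS and dump it; a rough sketch, with the file name as a placeholder:

import java.io.File;
import org.apache.avro.file.DataFileReader;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericRecord;

// Copy a rolled file out of HDFS first; the name below is a placeholder.
DataFileReader<GenericRecord> reader = new DataFileReader<GenericRecord>(
        new File("FlumeData.1376400000000"), new GenericDatumReader<GenericRecord>());
System.out.println("writer schema: " + reader.getSchema());
while (reader.hasNext()) {
    System.out.println(reader.next()); // one wrapper record per event
}
reader.close();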
Created 08-15-2013 01:52 AM
I've moved cdk-flume-avro-event-serializer-0.5.1-SNAPSHOT.jar into the lib directory of flume-ng. Now I get this error:
org.apache.flume.FlumeException: Unable to instantiate Builder from org.apache.flume.serialization.AvroEventSerializer: does not appear to implement org.apache.flume.serialization.EventSerializer$Builder
My agent config is:
## Write to HDFS
collector.sinks.HadoopOut.type = hdfs
collector.sinks.HadoopOut.hdfs.path = /flume/
collector.sinks.HadoopOut.hdfs.fileType = DataStream
collector.sinks.HadoopOut.hdfs.rollSize = 0
collector.sinks.HadoopOut.hdfs.rollCount = 5000
collector.sinks.HadoopOut.hdfs.rollInterval = 5
collector.sinks.HadoopOut.hdfs.batchSize = 1000
collector.sinks.HadoopOut.serializer = org.apache.flume.serialization.AvroEventSerializer
collector.sinks.HadoopOut.serializer.compressionCodec = snappy
Created 08-15-2013 03:19 AM
I've fixed it. I forgot the $Builder in the agent config.
Works like a charm.
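For anyone else who hits the Builder error, the corrected line from the config above is:
collector.sinks.HadoopOut.serializer = org.apache.flume.serialization.AvroEventSerializer$Builder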
Thanks
Andrew
Created 08-16-2013 01:08 PM
Can anyone help with this? I think I'm just not setting the body of the Flume event correctly.
Getting this to work would mean wider adoption of Hadoop at my company.
Created 08-16-2013 01:57 PM
I spoke too soon!
The answer was in the test class and its comments: the body is the binary-encoded Avro datum.
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import org.apache.avro.Schema;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.EncoderFactory;
import org.apache.avro.reflect.ReflectDatumWriter;
import org.apache.flume.Event;
import org.apache.flume.event.EventBuilder;

// The event body is the binary-encoded Avro datum, not JSON text.
Event event = EventBuilder.withBody(serializeAvro(record, schema));

// Writes one datum as Avro binary using the supplied schema.
private byte[] serializeAvro(Object datum, Schema schema) throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    ReflectDatumWriter<Object> writer = new ReflectDatumWriter<Object>(schema);
    BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
    out.reset();
    writer.write(datum, encoder);
    encoder.flush();
    return out.toByteArray();
}
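One detail worth repeating from my first post: the schema still has to travel in the event header (the literal string), since that is where the serializer picks up the writer schema for the Avro container file.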
Created 12-10-2018 01:26 PM
I might be facing the same issue, and unfortunately I don't see what the real issue is, given that when I try to read the file using the validateAvroFile method it fails for me on the console too. Could you please point out the real issue and guide me in the right direction?
Your help would mean a lot to me. Hoping for a response.
Thanks