
Flume Interceptor + Avro Event Serializer + HDFS + Multiple JSON Messages

Hello Cloudera Community!

 

I'm new to Big Data and seeking assistance.  

 

I'm developing a Flume job to write Avro files to HDFS. 

The source into Flume is a JSON string from Kafka. I use an Interceptor to convert/encode each message to an Avro object, then the AvroEventSerializer, with a custom schema (schemaURL), serializes it and writes it to HDFS.
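For context, the interceptor does roughly the following. This is a simplified sketch rather than my exact code: the Person schema, class name, and field names are placeholders for this post, and it assumes the incoming JSON lines up with the Avro JSON encoding of the schema.

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.io.EncoderFactory;
import org.apache.flume.Context;
import org.apache.flume.Event;
import org.apache.flume.interceptor.Interceptor;

// Placeholder interceptor: parses the JSON event body into a GenericRecord
// and replaces the body with the binary Avro encoding of that one record.
public class JsonToAvroInterceptor implements Interceptor {

    private static final String SCHEMA_JSON =
        "{\"type\":\"record\",\"name\":\"Person\",\"fields\":["
        + "{\"name\":\"name\",\"type\":\"string\"},"
        + "{\"name\":\"address\",\"type\":\"string\"}]}";

    private Schema schema;
    private GenericDatumReader<GenericRecord> jsonReader;
    private GenericDatumWriter<GenericRecord> avroWriter;

    @Override
    public void initialize() {
        schema = new Schema.Parser().parse(SCHEMA_JSON);
        jsonReader = new GenericDatumReader<>(schema);
        avroWriter = new GenericDatumWriter<>(schema);
    }

    @Override
    public Event intercept(Event event) {
        try {
            String json = new String(event.getBody(), StandardCharsets.UTF_8);
            // Decode the JSON text into a GenericRecord via Avro's JSON decoder.
            GenericRecord record =
                jsonReader.read(null, DecoderFactory.get().jsonDecoder(schema, json));

            // Re-encode the record as a single binary Avro datum; this byte
            // array becomes the new event body that the HDFS sink's
            // AvroEventSerializer appends to the Avro container file.
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
            avroWriter.write(record, encoder);
            encoder.flush();
            event.setBody(out.toByteArray());
            return event;
        } catch (IOException e) {
            // The real job logs and handles this; dropping keeps the sketch short.
            return null;
        }
    }

    @Override
    public List<Event> intercept(List<Event> events) {
        List<Event> out = new ArrayList<>(events.size());
        for (Event event : events) {
            Event intercepted = intercept(event);
            if (intercepted != null) {
                out.add(intercepted);
            }
        }
        return out;
    }

    @Override
    public void close() {
    }

    public static class Builder implements Interceptor.Builder {
        @Override
        public Interceptor build() {
            return new JsonToAvroInterceptor();
        }

        @Override
        public void configure(Context context) {
        }
    }
}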

 

This works great; however, I'm running into a slight issue.

 

When I try to pass multiple JSON messages in a single event body, they are not serialized properly, and I'm unable to deserialize the resulting file once it's on HDFS (avro-tools tojson ....).

 

Does anyone have experience writing multiple JSON messages in the same Flume event for serialization?

 

Ex. JSON = {"name": "Herbert", "address": "123 fake st."}
           {"name": "Monika", "address": "123 not fake street"}
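To show what I mean, here is a stand-alone sketch of how those two records end up in one event body (same placeholder Person schema as in the interceptor sketch above; again, illustrative rather than my exact code):

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.EncoderFactory;

// Stand-alone illustration of the multi-record event body I'm producing.
public class MultiRecordBodyExample {

    public static void main(String[] args) throws IOException {
        Schema schema = new Schema.Parser().parse(
            "{\"type\":\"record\",\"name\":\"Person\",\"fields\":["
            + "{\"name\":\"name\",\"type\":\"string\"},"
            + "{\"name\":\"address\",\"type\":\"string\"}]}");

        GenericRecord herbert = new GenericData.Record(schema);
        herbert.put("name", "Herbert");
        herbert.put("address", "123 fake st.");

        GenericRecord monika = new GenericData.Record(schema);
        monika.put("name", "Monika");
        monika.put("address", "123 not fake street");

        // Both datums are binary-encoded back-to-back into one byte array,
        // which then becomes the body of a single Flume event
        // (i.e. event.setBody(eventBody) in the interceptor).
        List<GenericRecord> records = Arrays.asList(herbert, monika);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
        GenericDatumWriter<GenericRecord> writer = new GenericDatumWriter<>(schema);
        for (GenericRecord record : records) {
            writer.write(record, encoder);
        }
        encoder.flush();
        byte[] eventBody = out.toByteArray();

        System.out.println("event body length = " + eventBody.length);
    }
}

It's the file produced from that kind of body that I can no longer read back with avro-tools tojson.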

 

I'd appreciate any input/experience you'd like to share.

 

Thanks!

 

J

 

 
