I have an Avro input source going through a morphline into Solr. For example, a record has the following structure:
{
  "username" : "alex",
  "date" : "21-08-2014",
  "attachments" : [
    {
      "documents" : [
        {
          "title" : "test",
          "tags" : [ "a", "b", "c" ]
        },
        {
          "optional1" : "test2",
          "title" : "test2"
        }
      ],
      "source" : "school"
    }
  ]
}
I can extract fields with extractAvroPaths, like so:
...
{
  extractAvroPaths {
    flatten : true
    paths : {
      /my_user : /username # this works fine
      /my_attachments : "/attachments[]"
      /my_documents : "/attachments[]/documents[]"
    }
  }
}
.....
The problem is that /my_attachments and /my_documents now contain raw JSON/Avro structures instead of single field values. How would I go about 'unwrapping' these fields so that they all become part of one Solr document, while still retaining the context of the document they belong to?
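For reference, a minimal and untested sketch of one possible direction: pointing each path at a leaf field rather than at a whole array element. The output field names (/my_sources, /my_titles, /my_tags) are made up for illustration, and the sketch assumes that extractAvroPaths accepts field steps after a [] step.

```
{
  extractAvroPaths {
    flatten : true
    paths : {
      /my_user : /username
      # hypothetical output fields, each targeting a leaf value
      /my_sources : "/attachments[]/source"
      /my_titles : "/attachments[]/documents[]/title"
      /my_tags : "/attachments[]/documents[]/tags[]"
    }
  }
}
```

If this behaves as assumed, each output field would hold plain values instead of nested structures, but it would not by itself record which title or tag belongs to which document, so the context part of the question remains open.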