
Problems using 'readMultiLine'



New Contributor

I'm trying to ingest ND JSON records like this:

        {
            "latitude": "44.041803",
            "occurrence_date": "2015-02-24 13:40:48",
            "longitude": "-123.082105",
            "external_device_id": "ER00E925"
        }

        {
            "latitude": "44.044547",
            "occurrence_date": "2015-02-24 13:41:19",
            "longitude": "-123.082082",
            "external_device_id": "ER00E925"
        }


I'm using a spool source to read the input file, and I tried to use 'readMultiLine' to turn the records into this:

{ "latitude": "44.041803", "occurrence_date": "2015-02-24 13:40:48", "longitude": "-123.082105", "external_device_id": "ER00E925"},

{ "latitude": "44.044547", "occurrence_date": "2015-02-24 13:41:19", "longitude": "-123.082082", "external_device_id": "ER00E925"}
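For context, the kind of morphline being attempted might look like the sketch below. The readMultiLine command and its regex, what, and charset options are real Kite Morphlines features, but the particular pattern here is only an illustration of the attempt, not a working configuration (the reply below explains why it cannot work with a line-oriented source):

```
morphlines : [
  {
    id : morphline1
    importCommands : ["org.kitesdk.**"]
    commands : [
      {
        # Append any line that starts with an indented quoted field
        # to the previous line; the regex is illustrative only.
        readMultiLine {
          regex : """^\s+".*"""
          what : previous
          charset : UTF-8
        }
      }
    ]
  }
]
```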


But instead I still get each input line output as an individual message.

I've tried various combinations of 'regex', 'what', etc.


I searched, found the log4j example, and cut and pasted both the data and the morphline, with the same results.


Am I misunderstanding the fundamental operation of 'readMultiLine', doing something trivially stupid, or both?






Re: Problems using 'readMultiLine'

Expert Contributor
The Flume spool source emits one Flume event per input line, so the morphline never receives an event that contains multiple lines, and readMultiLine can therefore never emit more than a single line per Flume event. You may be able to work around this by configuring the Flume spool source to use the BlobDeserializer, which emits the entire input file as a single event (not applicable to large files due to RAM pressure).
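As a sketch, that change might look like this in the agent configuration. The agent and source names (a1, src) and the spool directory are placeholders; the deserializer class is the one shipped with the Flume morphline Solr sink:

```
a1.sources.src.type = spooldir
a1.sources.src.spoolDir = /var/spool/ndjson
a1.sources.src.deserializer = org.apache.flume.sink.solr.morphline.BlobDeserializer$Builder
# Optional: the BlobDeserializer's size cap in bytes; raise it if input files are large
a1.sources.src.deserializer.maxBlobLength = 100000000
```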



Re: Problems using 'readMultiLine'

New Contributor

Ahh, OK I see.

I had previously tried changing the Spool deserializer from LINE to Blob.

My source is limited to max of 5000 'records', so RAM demands are low.

That enabled me to use a regex to eliminate newlines and aggregate lines into 'records',

but I got errors complaining that I could not create multiple output events from one (blob) input event.


Perhaps I can serialize the single aggregated output blob into individual 'records' at the Sink.

Or I could write a custom ND JSON deserializer for the Spool source that emits JSON 'records' the way I want them.
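Another option is to normalize the file before it reaches the spool directory. As a minimal sketch (the function name to_ndjson is mine, and it assumes the input is a stream of pretty-printed JSON objects like the records above, optionally separated by commas or blank lines), a small preprocessing script could collapse each object onto one line so the default LINE deserializer emits one event per record:

```python
import json

def to_ndjson(text):
    """Collapse a stream of pretty-printed JSON objects into ND JSON,
    one object per line. Assumes the input contains only JSON objects,
    optionally separated by commas and/or whitespace."""
    decoder = json.JSONDecoder()
    text = text.strip()
    lines = []
    idx = 0
    while idx < len(text):
        # raw_decode parses one JSON value starting exactly at idx
        obj, end = decoder.raw_decode(text, idx)
        lines.append(json.dumps(obj, separators=(", ", ": ")))
        idx = end
        # skip any separators (commas, whitespace) between objects
        while idx < len(text) and text[idx] in ", \t\r\n":
            idx += 1
    return "\n".join(lines)
```

Running the spooled files through this before Flume picks them up would sidestep both the readMultiLine limitation and the one-event-per-blob restriction.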


Thank you very much for straightening me out on readMultiLine.



