Created 08-09-2016 04:03 PM
The current workflow is exporting each event.
We are looking to merge all JSON events based on service/eventName, concatenate them over time, and export them to S3. Our requirement is to merge them using Expression Language at runtime.
Created 08-09-2016 05:23 PM
The MergeContent processor can be used to merge JSON together and has a property called "Correlation Attribute Name" which when specified will merge together flow files that have the same value for the attribute specified.
In your scenario you first need to use EvaluateJSONPath to extract "service" and "eventName" from the JSON document. Based on your sample JSON it seems like they are at the root level of the document so I believe something like:
service = $.service
eventName = $.eventName
Then you need to get these two values into a single attribute, so you can use UpdateAttribute with something like:
serviceEventName = ${service}/${eventName}
Then in MergeContent set the "Correlation Attribute Name" to "serviceEventName". You can also specify the minimum group size and age so that you can merge together either 100MB or 1 hour worth of data.
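To make the grouping concrete, here is a minimal Python sketch of what correlating on "serviceEventName" amounts to. It is not NiFi code, only an illustration of the binning logic; the function names and the sample events are hypothetical, and it assumes "service" and "eventName" sit at the root of each JSON document as described above.

```python
import json
from collections import defaultdict


def correlation_key(event_json):
    """Build the same value as serviceEventName = ${service}/${eventName}."""
    event = json.loads(event_json)
    return f"{event['service']}/{event['eventName']}"


def bin_by_correlation(events):
    """Group raw JSON strings by correlation key, keeping arrival order."""
    bins = defaultdict(list)
    for raw in events:
        bins[correlation_key(raw)].append(raw)
    return dict(bins)


# Hypothetical sample events, for illustration only.
events = [
    '{"service": "s3", "eventName": "PutObject", "id": 1}',
    '{"service": "ec2", "eventName": "RunInstances", "id": 2}',
    '{"service": "s3", "eventName": "PutObject", "id": 3}',
]
bins = bin_by_correlation(events)
```

Each key in `bins` corresponds to one merged output, so the two "s3/PutObject" events end up together and the "ec2/RunInstances" event stays on its own.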
Created 08-10-2016 08:03 AM
@Bryan Bende Thanks for the answer, it worked for me. There is just one small configuration I am still looking for. Currently, when I merge my JSON events and export them to S3, I get the concatenated JSON events on a single line, delimited by a space. How can I get the JSON events delimited by a newline (\n)? Thank you.
Created 08-10-2016 01:19 PM
In MergeContent there is a Delimiter Strategy property; choose "Text", which means it uses the values typed into Header, Demarcator, and Footer. The Demarcator is what gets put between each FlowFile that is merged together. You can enter a new line with Shift+Enter.
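As a rough illustration of how Header, Demarcator, and Footer combine, here is a small Python sketch. It only mimics the text layout of the merged output and is not how NiFi implements it; `merge_text` and the sample contents are hypothetical names made up for this example.

```python
def merge_text(contents, header="", demarcator="\n", footer=""):
    """Join FlowFile contents: header first, demarcator between items, footer last."""
    return header + demarcator.join(contents) + footer


# Hypothetical FlowFile contents, for illustration only.
contents = ['{"id": 1}', '{"id": 2}', '{"id": 3}']

# A newline demarcator puts one JSON event per line.
one_per_line = merge_text(contents)
```

With a newline as the Demarcator and empty Header and Footer, each event lands on its own line, and no trailing newline is added after the last one.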
Created 08-09-2016 07:45 PM
eventsink/${service_type}/${event_name}/${now():format('yyyy/MM/dd/HHmmssSSS')}.${filename}.json
As you have it above, you are asking for "now()" multiple times, which could cause some weirdness if the hour rolls over between invocations, etc. Doing it all with a single call to now() addresses this and simplifies the configuration as well.
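The same idea, sketched in Python rather than Expression Language: read the clock once and derive every part of the path from that one value. The function name and arguments are hypothetical, and the format string is only my approximation of 'yyyy/MM/dd/HHmmssSSS'.

```python
from datetime import datetime


def object_key(service_type, event_name, filename, now=None):
    """Build the object key from a single timestamp so all parts agree."""
    if now is None:
        now = datetime.now()  # one clock read, reused for every field
    millis = now.microsecond // 1000
    stamp = now.strftime("%Y/%m/%d/%H%M%S") + f"{millis:03d}"
    return f"eventsink/{service_type}/{event_name}/{stamp}.{filename}.json"


# Fixed timestamp so the result is reproducible; the values are made up.
key = object_key("s3", "PutObject", "abc", datetime(2016, 8, 9, 19, 45, 3, 123000))
```

Because the date and time portions come from the same `now` value, they cannot disagree even if the call happens right at a rollover.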
Created 08-10-2016 07:55 AM
@mpayne thanks for pointing it out 🙂