Currently, we have a setup where we are trying to group together events based on timestamp and split them based on timestamp in order to keep the stack trace error which have newlines in them.
We are currently using the SplitContent to split on:
"(newline)
20"
This is the format of the logs: (this example is supposed to appear single spaced, I hope it does when posted)
2017-07-13 01:00:00,123 Log data here
2017-07-13 01:00:00,124 Log data here
2017-07-13 01:00:00,125 Stack trace error here...
Stack trace error ....
Stack trace error....
Stack trace error.....
2017-07-13 01:00:00,126 Log data here
Using "(newline) 20" allows us to maintain everything between the timestamps as an event, including the stack trace. Oddly enough it will produce in HDFS events with an extra blank line in between each event. (Not usually a big deal but with 2-3 GB files, we are seeing 100+ MB of just space for blank lines) Our current solution is to have ReplaceText processor that will remove all the blank lines but its obviously not optimal.
Any suggestions are welcome. Please see screenshots.


