I am working on ingesting CloudWatch log files into my environment. We would like to avoid splitting these files, because 5 flowfiles become over 1,000 and then over 30,000. We receive hundreds of these logs a day, so splitting and merging is NOT scalable.
This is what the data looks like as it arrives as a flowfile:
Note that a flowfile may contain one 'DATA_MESSAGE' or thousands. Likewise, each DATA_MESSAGE may contain one logEvents entry or thousands.
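To make the nesting concrete, here is a minimal Python sketch of walking every DATA_MESSAGE and every logEvents entry in a single pass, without splitting the content into separate pieces. It is only an illustration under assumptions: it assumes the flowfile content is a series of JSON objects written back to back, and that the field names (`messageType`, `logGroup`, `logStream`, `logEvents`, `id`, `timestamp`, `message`) follow the usual CloudWatch Logs subscription layout. The function names `iter_messages` and `flatten_log_events` are hypothetical.

```python
import json
from typing import Any, Dict, Iterator


def iter_messages(text: str) -> Iterator[Dict[str, Any]]:
    """Yield each top-level JSON object from back-to-back JSON text.

    Assumption: the objects are concatenated, optionally separated by
    whitespace or newlines, rather than wrapped in one JSON array.
    """
    decoder = json.JSONDecoder()
    pos, end = 0, len(text)
    while pos < end:
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < end and text[pos].isspace():
            pos += 1
        if pos >= end:
            break
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


def flatten_log_events(text: str) -> Iterator[Dict[str, Any]]:
    """Yield one flat record per log event, keeping the parent's context."""
    for msg in iter_messages(text):
        # Skip anything that is not a DATA_MESSAGE (assumed field name).
        if msg.get("messageType") != "DATA_MESSAGE":
            continue
        for event in msg.get("logEvents", []):
            yield {
                "logGroup": msg.get("logGroup"),
                "logStream": msg.get("logStream"),
                "id": event.get("id"),
                "timestamp": event.get("timestamp"),
                "message": event.get("message"),
            }
```

Because both functions are generators, the records are produced one at a time, so one input stays one input no matter how many DATA_MESSAGE or logEvents entries it holds. Note that this sketch still reads the whole content into a single string first, which may not suit very large files.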