Since Ranger 0.5, audit events that differ only by timestamp can be summarized to reduce the number of events logged in a busy system. When summarization is enabled and a Ranger plugin logs consecutive audit events that differ only by timestamp, it coalesces all such events into a single event, setting 'event_count' to the number of events logged and 'event_dur_ms' to the time difference in milliseconds between the first and last event.
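The coalescing rule described above can be sketched as follows. This is a minimal illustration, not Ranger's actual implementation: the `AuditEvent` fields other than `event_count` and `event_dur_ms`, and the `summarize` helper itself, are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple


@dataclass(frozen=True)
class AuditEvent:
    # Hypothetical event shape; only event_count and event_dur_ms
    # are field names taken from the documentation.
    user: str
    resource: str
    action: str
    result: str
    timestamp_ms: int
    event_count: int = 1
    event_dur_ms: int = 0


def _identity(event: AuditEvent) -> Tuple[str, str, str, str]:
    """Everything except the timestamp and the summary fields."""
    return (event.user, event.resource, event.action, event.result)


def summarize(events: List[AuditEvent]) -> List[AuditEvent]:
    """Coalesce consecutive events that differ only by timestamp."""
    summarized: List[AuditEvent] = []
    for event in events:
        if summarized and _identity(summarized[-1]) == _identity(event):
            first = summarized[-1]
            # Keep the first event's timestamp, bump the count, and
            # record the span between the first and the latest event.
            summarized[-1] = replace(
                first,
                event_count=first.event_count + 1,
                event_dur_ms=event.timestamp_ms - first.timestamp_ms,
            )
        else:
            summarized.append(event)
    return summarized


if __name__ == "__main__":
    raw = [
        AuditEvent("alice", "/data/a", "read", "allowed", 1000),
        AuditEvent("alice", "/data/a", "read", "allowed", 1200),
        AuditEvent("alice", "/data/a", "read", "allowed", 1900),
        AuditEvent("bob", "/data/a", "read", "denied", 2000),
    ]
    for e in summarize(raw):
        print(e.user, e.event_count, e.event_dur_ms)
```

In this sketch the three consecutive events for `alice` collapse into one event with 'event_count' 3 and 'event_dur_ms' 900, while the event for `bob` is left as a single event.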
To enable this feature you must set the following properties in the Ranger plugin's configuration:
To enable summarization, set this property to true. Audit messages are then summarized before they are sent to the various sinks.
By default it is set to false, i.e. audit summarization is disabled.
If unspecified, this value defaults to 1048576, i.e. the queue is sized to store 1M (1024 * 1024) messages.
Note the difference in the name of the property that controls the size of the summary queue.
The maximum time interval, in milliseconds, over which messages are summarized.
If unspecified, it defaults to 5000, i.e. 5 seconds.
Summarization batch size
Note that, regardless of this time interval, at most 100k messages are considered for aggregation at a time. Thus, if more than 100k messages are logged during the interval, similar messages can show up as multiple summarized audit messages even though they were logged within the configured time interval.
Currently, this value of 100k is not user configurable. It is mentioned here to give a better understanding of the summarization logic.
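The effect of the batch limit can be illustrated with a small sketch. It is an assumption-laden model, not Ranger's code: events are reduced to hypothetical `(key, timestamp_ms)` tuples, `summarize_in_batches` is an invented helper, and a tiny batch size stands in for the fixed 100k limit.

```python
from itertools import groupby
from typing import List, Tuple

Event = Tuple[str, int]            # (identity key, timestamp in ms)
Summary = Tuple[str, int, int]     # (identity key, event_count, event_dur_ms)


def summarize_in_batches(events: List[Event], batch_size: int) -> List[Summary]:
    """Summarize runs of identical events, looking at one batch at a time.

    Runs never span two batches, so a long run of identical events is
    reported as several summaries once it exceeds the batch size.
    """
    summaries: List[Summary] = []
    for start in range(0, len(events), batch_size):
        batch = events[start:start + batch_size]
        # groupby groups consecutive items that share the same key.
        for key, run in groupby(batch, key=lambda e: e[0]):
            items = list(run)
            duration = items[-1][1] - items[0][1]
            summaries.append((key, len(items), duration))
    return summaries


if __name__ == "__main__":
    # Five identical events, 10 ms apart.
    events = [("alice:/data/a:read", t) for t in (0, 10, 20, 30, 40)]
    print(summarize_in_batches(events, batch_size=100))  # one summary
    print(summarize_in_batches(events, batch_size=3))    # split in two
```

With a batch size larger than the run, the five events become one summary with a count of 5. With a batch size of 3, the same events yield two summaries (counts 3 and 2), which mirrors how similar messages can appear as multiple summarized audit messages when more than 100k messages arrive within the interval.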