Created on 02-27-2023 02:03 AM - edited on 02-27-2023 10:54 PM by VidyaSargur
Spark Rolling event log files
1. Introduction
When running a long-running Spark application (for example, a streaming application), Spark writes a single event log file that keeps growing until the application is killed or stopped. A single huge event log file is costly to maintain, and the Spark History Server needs a lot of resources to replay it on every update.
To avoid creating a single huge event log file, the Spark team introduced rolling event log files.
2. Enabling the Spark Rolling Event logs in CDP
Step 1: Enable the rolling event logs and set the max file size
CM --> Spark 3 --> Configuration --> Spark 3 Client Advanced Configuration Snippet (Safety Valve) for spark3-conf/spark-defaults.conf
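The exact properties for this step are not listed here. A minimal sketch of what the safety valve entry might contain, assuming the standard Apache Spark property names and an illustrative size of 128m (the Apache Spark default for this property):

```properties
# Assumption: standard Apache Spark 3.x property names.
# Turn on rolling so the event log is split into multiple files.
spark.eventLog.rolling.enabled=true
# Roll over to a new file once the current one reaches this size (illustrative value).
spark.eventLog.rolling.maxFileSize=128m
```

These are client-side settings, so they apply to applications submitted after the client configuration is redeployed.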
By default, spark.history.fs.eventLog.rolling.maxFilesToRetain is not limited, meaning all event log files are retained. The minimum value is 1.
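As a sketch, limiting retention is a single property on the Spark History Server side. The value 2 below is an arbitrary illustration, and where the line is placed in Cloudera Manager (for example, a History Server safety valve) is an assumption that is not confirmed here:

```properties
# Assumption: set in the Spark History Server configuration, not the client configuration.
# Keep only the 2 most recent event log files for each application (illustrative value).
spark.history.fs.eventLog.rolling.maxFilesToRetain=2
```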
3. Verify the output
List the Spark History Server event log directory and confirm that the application now writes multiple rolled event log files instead of a single file.
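When checking a listing (for example, names copied from `hdfs dfs -ls` output), it can help to pick out the rolled files and order them by index. The sketch below assumes the Apache Spark 3.x naming scheme, in which each application gets a directory named `eventlog_v2_<appId>` containing `events_<index>_<appId>` files and an `appstatus_<appId>` marker. The helper name `rolled_event_files` and the sample application ID are hypothetical.

```python
import re

# Assumed naming scheme for rolled event logs (Apache Spark 3.x):
#   directory:   eventlog_v2_<appId>
#   event files: events_<index>_<appId>
#   status file: appstatus_<appId>[.inprogress]
EVENT_FILE = re.compile(r"^events_(\d+)_(.+)$")


def rolled_event_files(names):
    """Return (index, name) pairs for rolled event files, sorted by index."""
    found = []
    for name in names:
        match = EVENT_FILE.match(name)
        if match:
            found.append((int(match.group(1)), name))
    return sorted(found)


# Hypothetical file names, as they might appear inside one eventlog_v2_* directory.
listing = [
    "appstatus_application_1677000000000_0001.inprogress",
    "events_2_application_1677000000000_0001",
    "events_1_application_1677000000000_0001",
    "events_3_application_1677000000000_0001",
]

files = rolled_event_files(listing)
for index, name in files:
    print(index, name)
```

More than one `events_` entry for the same application indicates that rolling is active. A single entry may only mean that the first file has not yet reached the configured maximum size.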