Spark Rolling event log files

 

1. Introduction

When a long-running Spark application (for example, a streaming application) runs, Spark writes a single event log file that keeps growing until the application is stopped or killed. A single huge event log file is costly to maintain, and the Spark History Server needs a lot of resources to replay it on every update.

 

To avoid creating a single huge event log file, Spark introduced rolling event log files (available since Spark 3.0).

 

2. Enabling the Spark Rolling Event logs in CDP

Step 1: Enable the rolling event logs and set the max file size

 

CM --> Spark 3 --> Configuration --> Spark 3 Client Advanced Configuration Snippet (Safety Valve) for spark3-conf/spark-defaults.conf

spark.eventLog.rolling.enabled=true
spark.eventLog.rolling.maxFileSize=128m

The default value of spark.eventLog.rolling.maxFileSize is 128 MB, and the minimum value is 10 MB. When the current event log file reaches this size, Spark rolls over to a new file.
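As a rough illustration of the two settings above, the following Python sketch checks a candidate size against the 10 MB minimum and renders the properties as spark-submit `--conf` flags, which is one way to try them on a single application before changing the safety valve. The helper names `size_to_bytes` and `rolling_conf_flags` are hypothetical, and the parser only understands the `k`, `m`, and `g` suffixes; it is not Spark's own size parser.

```python
import re
from typing import List

# Minimum allowed value for spark.eventLog.rolling.maxFileSize (10 MB, as stated above).
MIN_ROLLING_FILE_SIZE = 10 * 1024 * 1024

_UNITS = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def size_to_bytes(size: str) -> int:
    """Convert a size string such as '128m' or '10mb' to bytes (k/m/g suffixes only)."""
    match = re.fullmatch(r"(\d+)([kmg])b?", size.strip().lower())
    if match is None:
        raise ValueError(f"unsupported size string: {size!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def rolling_conf_flags(max_file_size: str = "128m") -> List[str]:
    """Return spark-submit --conf flags that enable rolling event logs."""
    if size_to_bytes(max_file_size) < MIN_ROLLING_FILE_SIZE:
        raise ValueError("spark.eventLog.rolling.maxFileSize must be at least 10m")
    return [
        "--conf", "spark.eventLog.rolling.enabled=true",
        "--conf", f"spark.eventLog.rolling.maxFileSize={max_file_size}",
    ]


if __name__ == "__main__":
    # Print the flags so they can be pasted into a spark-submit command line.
    print(" ".join(rolling_conf_flags("128m")))
```

Validating the size up front keeps a too-small value from reaching the cluster configuration in the first place.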

 

Step 2: Set the rolling event log max files to retain

 

CM --> Spark 3 --> Configuration --> History Server Advanced Configuration Snippet (Safety Valve) for spark3-conf/spark-history-server.conf

spark.history.fs.eventLog.rolling.maxFilesToRetain=2

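By default, spark.history.fs.eventLog.rolling.maxFilesToRetain is effectively unlimited, meaning all event log files are retained. The minimum value is 1. When a limit is set, the History Server keeps that many of the latest files in non-compacted form and merges the older ones into a compact file.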
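To make the effect of maxFilesToRetain concrete, here is a small Python model of which files one would expect to see in the application's event log directory. It is only a sketch inferred from the listings in section 3: it assumes files are numbered from 1, that the newest maxFilesToRetain files stay as they are, and that everything older ends up in a single `.compact` file named after the newest merged file. It ignores when the History Server actually runs compaction, and the function name `expected_event_files` is hypothetical.

```python
from typing import List


def expected_event_files(latest_index: int, max_files_to_retain: int, app_id: str) -> List[str]:
    """Model the event file names left in the directory after compaction.

    Simplified assumption: the newest `max_files_to_retain` files are kept
    as-is and all older files are merged into one '.compact' file.
    """
    if latest_index < 1 or max_files_to_retain < 1:
        raise ValueError("latest_index and max_files_to_retain must be >= 1")

    # Index of the oldest file that is still kept in non-compacted form.
    first_retained = max(1, latest_index - max_files_to_retain + 1)

    files = []
    if first_retained > 1:
        # Everything before first_retained is represented by one compact file.
        files.append(f"events_{first_retained - 1}_{app_id}.compact")
    files.extend(f"events_{i}_{app_id}" for i in range(first_retained, latest_index + 1))
    return files


if __name__ == "__main__":
    app = "application_1672813574470_0002"
    for latest in (2, 3, 4):
        print(latest, expected_event_files(latest, 2, app))
```

With maxFilesToRetain=2, this model yields one compact file plus the two newest event files once a third file has been created.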

 

3. Verify the output

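Verify the output by listing the application's directory under the Spark History Server event log directory. The three listings below were taken shortly after one another while the application was still running: as new event files are created, older ones are merged into a .compact file and only the two most recent event files stay uncompacted. The event files here roll at about 10 MB, which suggests this run used the minimum maxFileSize rather than 128m.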

 

[root@c3543-node4 ~]# sudo -u spark hdfs dfs -ls -R /user/spark/spark3ApplicationHistory/eventlog_v2_application_1672813574470_0002
-rw-rw----   3 spark spark          0 2023-01-04 07:03 /user/spark/spark3ApplicationHistory/eventlog_v2_application_1672813574470_0002/appstatus_application_1672813574470_0002.inprogress
-rw-rw----   3 spark spark   10485458 2023-01-04 07:05 /user/spark/spark3ApplicationHistory/eventlog_v2_application_1672813574470_0002/events_1_application_1672813574470_0002
-rw-rw----   3 spark spark          0 2023-01-04 07:05 /user/spark/spark3ApplicationHistory/eventlog_v2_application_1672813574470_0002/events_2_application_1672813574470_0002

[root@c3543-node4 ~]# sudo -u spark hdfs dfs -ls -R /user/spark/spark3ApplicationHistory/eventlog_v2_application_1672813574470_0002
-rw-rw----   3 spark spark          0 2023-01-04 07:03 /user/spark/spark3ApplicationHistory/eventlog_v2_application_1672813574470_0002/appstatus_application_1672813574470_0002.inprogress
-rw-rw----   3 spark spark     492014 2023-01-04 07:06 /user/spark/spark3ApplicationHistory/eventlog_v2_application_1672813574470_0002/events_1_application_1672813574470_0002.compact
-rw-rw----   3 spark spark   10489509 2023-01-04 07:06 /user/spark/spark3ApplicationHistory/eventlog_v2_application_1672813574470_0002/events_2_application_1672813574470_0002
-rw-rw----   3 spark spark     227068 2023-01-04 07:06 /user/spark/spark3ApplicationHistory/eventlog_v2_application_1672813574470_0002/events_3_application_1672813574470_0002

[root@c3543-node4 ~]# sudo -u spark hdfs dfs -ls -R /user/spark/spark3ApplicationHistory/eventlog_v2_application_1672813574470_0002
-rw-rw----   3 spark spark          0 2023-01-04 07:03 /user/spark/spark3ApplicationHistory/eventlog_v2_application_1672813574470_0002/appstatus_application_1672813574470_0002.inprogress
-rw-rw----   3 spark spark     873356 2023-01-04 07:06 /user/spark/spark3ApplicationHistory/eventlog_v2_application_1672813574470_0002/events_2_application_1672813574470_0002.compact
-rw-rw----   3 spark spark   10484816 2023-01-04 07:06 /user/spark/spark3ApplicationHistory/eventlog_v2_application_1672813574470_0002/events_3_application_1672813574470_0002
-rw-rw----   3 spark spark     339165 2023-01-04 07:06 /user/spark/spark3ApplicationHistory/eventlog_v2_application_1672813574470_0002/events_4_application_1672813574470_0002
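The listings above can also be checked programmatically. The following Python sketch parses `hdfs dfs -ls` output lines and classifies each file as the application status marker, a regular event file, or a compact file. The names `EventLogEntry`, `parse_listing_line`, and `summarize` are hypothetical, and the parser assumes the eight-column layout shown above with the file naming seen in this article.

```python
import re
from typing import List, NamedTuple, Optional


class EventLogEntry(NamedTuple):
    kind: str             # "status", "events", or "compact"
    index: Optional[int]  # rolling file index; None for the status marker
    size: int             # file size in bytes


def parse_listing_line(line: str) -> Optional[EventLogEntry]:
    """Parse one line of `hdfs dfs -ls` output; return None for anything else."""
    fields = line.split()
    if len(fields) < 8 or not fields[0].startswith("-"):
        return None  # shell prompts, blank lines, directories
    size = int(fields[4])
    name = fields[-1].rsplit("/", 1)[-1]
    if name.startswith("appstatus_"):
        return EventLogEntry("status", None, size)
    match = re.match(r"events_(\d+)_", name)
    if match is None:
        return None
    kind = "compact" if name.endswith(".compact") else "events"
    return EventLogEntry(kind, int(match.group(1)), size)


def summarize(listing: str) -> List[EventLogEntry]:
    """Return the recognized entries of a whole listing, in order."""
    parsed = (parse_listing_line(line) for line in listing.splitlines())
    return [entry for entry in parsed if entry is not None]


# Second listing from this article, copied verbatim.
SAMPLE = """
-rw-rw----   3 spark spark          0 2023-01-04 07:03 /user/spark/spark3ApplicationHistory/eventlog_v2_application_1672813574470_0002/appstatus_application_1672813574470_0002.inprogress
-rw-rw----   3 spark spark     492014 2023-01-04 07:06 /user/spark/spark3ApplicationHistory/eventlog_v2_application_1672813574470_0002/events_1_application_1672813574470_0002.compact
-rw-rw----   3 spark spark   10489509 2023-01-04 07:06 /user/spark/spark3ApplicationHistory/eventlog_v2_application_1672813574470_0002/events_2_application_1672813574470_0002
-rw-rw----   3 spark spark     227068 2023-01-04 07:06 /user/spark/spark3ApplicationHistory/eventlog_v2_application_1672813574470_0002/events_3_application_1672813574470_0002
"""

if __name__ == "__main__":
    for entry in summarize(SAMPLE):
        print(entry)
```

Counting the entries whose kind is "events" gives a quick check that no more than maxFilesToRetain non-compacted files are present after compaction.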

 
