
Spark application logs in cluster mode

New Contributor

Hi All,

I have a Spark application running in YARN cluster mode: spark-submit --master yarn --deploy-mode cluster ...


The Spark logs (driver and executor) are stored on HDFS (/user/spark/driverLogs) and are available via the Cloudera Web UI (Cloudera Web UI -> Clusters -> YARN -> Applications tab -> click on application ID -> Logs) while the Spark application is running.


However, when the application completes or fails, the logs are removed from the HDFS /user/spark/driverLogs folder and are therefore no longer available via the Cloudera Web UI.


When running the application in YARN client mode (spark-submit --master yarn --deploy-mode client ...), the logs are kept after the application completes.


spark-defaults is attached.

Any ideas about the root cause?




Cloudera Employee

The location of the Spark application history logs in HDFS is configured by spark.eventLog.dir, which is set to /user/spark/applicationHistory, so the logs you need can be found under this path.
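
To confirm which directories are actually configured on your cluster, you can read the relevant properties straight from spark-defaults.conf. The following is only a minimal sketch, not an official tool: spark_conf_get is a hypothetical helper, and the /etc/spark/conf location in the usage comment is an assumption about a typical install. spark.driver.log.dfsDir is, as far as I know, the property behind the driver log directory, but please check it against your own spark-defaults.

```shell
#!/bin/sh
# Hypothetical helper: print the value of one Spark property from a
# spark-defaults.conf style file (key and value separated by whitespace or "=").
# Usage: spark_conf_get <file> <property>
spark_conf_get() {
    awk -v key="$2" '
        /^[[:space:]]*#/ { next }              # skip comment lines
        {
            line = $0
            sub(/^[[:space:]]+/, "", line)     # trim leading whitespace
            n = match(line, /[[:space:]=]+/)   # first separator run
            if (n == 0) next                   # blank line or key without value
            k = substr(line, 1, n - 1)
            v = substr(line, n + RLENGTH)
            sub(/[[:space:]]+$/, "", v)        # trim trailing whitespace
            if (k == key) { print v; exit }    # first match wins
        }
    ' "$1"
}

# Example usage (the path is an assumption, adjust it to your deployment):
#   CONF="${SPARK_CONF_DIR:-/etc/spark/conf}/spark-defaults.conf"
#   spark_conf_get "$CONF" spark.eventLog.dir
#   spark_conf_get "$CONF" spark.driver.log.dfsDir
```

You can then list the reported directory with hdfs dfs -ls to see whether the files for your application are really there.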


The /user/spark/driverLogs path in HDFS contains Spark driver logs only when the Spark application runs in client mode.
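
In cluster mode the driver runs inside a YARN container, so its output is normally part of the YARN container logs. Assuming YARN log aggregation is enabled on your cluster, those logs can usually be fetched after the application finishes with yarn logs -applicationId. Here is a small sketch of that; yarn_logs_cmd is a hypothetical helper name and the application ID in the usage comment is a placeholder, not a real one.

```shell
#!/bin/sh
# Hypothetical helper: print the command that fetches the aggregated YARN
# container logs (driver and executors) for one application.
# Assumes YARN log aggregation is enabled; this only builds the command,
# it does not contact the cluster.
yarn_logs_cmd() {
    app_id="$1"
    case "$app_id" in
        application_[0-9]*_[0-9]*) ;;   # loose sanity check on the ID shape
        *) echo "invalid application id: $app_id" >&2; return 1 ;;
    esac
    echo "yarn logs -applicationId $app_id"
}

# Example usage (placeholder ID, take the real one from the Applications tab):
#   cmd=$(yarn_logs_cmd application_1700000000000_0001) && $cmd > app.log
```

Building the command separately from running it makes it easy to check the ID first and to redirect the (often large) output to a file.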


Please refer to the link below for more details: