
Spark application logs in cluster mode

New Contributor

Hi All,

I have a Spark application running in YARN cluster mode: spark-submit --master yarn --deploy-mode cluster ...
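
For completeness, the full command is along these lines (the class, jar, and resource settings below are placeholders, not my actual ones):

# illustrative only - class, jar, and resource values are placeholders
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --num-executors 4 \
  --executor-memory 4G \
  --class com.example.MyApp \
  myapp.jar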


The Spark logs (driver and executor) are stored on HDFS (/user/spark/driverLogs) and are available via the Cloudera Web UI (Cloudera Web UI -> Clusters -> YARN -> Applications tab -> click on the application ID -> Logs) while the Spark application is running.

However, when the application completes or fails, the logs are removed from the HDFS /user/spark/driverLogs folder and are thus no longer available via the Cloudera Web UI.

When running the application in YARN client mode (spark-submit --master yarn --deploy-mode client ...), the logs are kept after the application completes.

My spark-defaults.conf is attached.
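
The driver-log and event-log entries in it are, as far as I can tell, the stock Cloudera defaults, roughly:

# approximate values, not copied verbatim from my file
spark.driver.log.persistToDfs.enabled=true
spark.driver.log.dfsDir=/user/spark/driverLogs
spark.eventLog.enabled=true
spark.eventLog.dir=hdfs:///user/spark/applicationHistory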

Any ideas about the root cause?

2 Replies

Contributor

The location of Spark application history logs in HDFS is configured by spark.eventLog.dir, which is set to /user/spark/applicationHistory, so the logs you need can be found under that path.
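
For example, you can check that the event logs for your application are there (the application ID below is just an example):

# list the Spark event logs written by running and completed applications
hdfs dfs -ls /user/spark/applicationHistory
hdfs dfs -ls /user/spark/applicationHistory/application_1234567890123_0001

These event logs are what the Spark History Server UI renders for completed applications.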


The /user/spark/driverLogs path in HDFS contains only the Spark driver logs, and only when the application runs in client mode.
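
In cluster mode the driver runs inside a YARN container, so once the application finishes you can pull the driver and executor logs through YARN log aggregation instead, assuming yarn.log-aggregation-enable is set to true in yarn-site.xml (the application ID below is just an example):

# fetch aggregated container logs (driver + executors) after completion
yarn logs -applicationId application_1234567890123_0001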


Please refer to the link below for more details: https://docs.cloudera.com/documentation/enterprise/6/properties/6.3/topics/cm_props_cdh630_spark.htm...
