Spark application logs in cluster mode

New Contributor

Hi All,

I have a Spark application running in YARN cluster mode: spark-submit --master yarn --deploy-mode cluster ...
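For illustration, the full command is along these lines (com.example.MyApp and myapp.jar are placeholder names, not my real ones):

    # placeholder class/jar, shown only to illustrate the invocation
    spark-submit \
      --master yarn \
      --deploy-mode cluster \
      --class com.example.MyApp \
      myapp.jar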


The Spark logs (driver and executor) are stored on HDFS (/user/spark/driverLogs) and available via the Cloudera Web UI (Cloudera Web UI -> Clusters -> YARN -> Applications tab -> click on the application ID -> Logs) while the Spark application is running.
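While the application is running, the logs can also be checked directly on HDFS:

    # list the per-application driver log directories
    hdfs dfs -ls /user/spark/driverLogs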


However, when the application completes or fails, the logs are removed from the HDFS /user/spark/driverLogs folder and are therefore no longer available via the Cloudera Web UI.


When running the application in YARN client mode (spark-submit --master yarn --deploy-mode client ...), the logs are kept after the application completes.


My spark-defaults.conf is attached.
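For reference, the driver-log and event-log locations are controlled by properties like these (the values shown are the usual CDH defaults, quoted here for illustration rather than copied from my attachment):

    spark.eventLog.enabled                   true
    spark.eventLog.dir                       hdfs:///user/spark/applicationHistory
    spark.driver.log.persistToDfs.enabled    true
    spark.driver.log.dfsDir                  /user/spark/driverLogs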

Any ideas about the root cause?

1 REPLY

Cloudera Employee

The location of the Spark application history logs in HDFS is configured by spark.eventLog.dir, which is set to /user/spark/applicationHistory. The logs you are looking for can be found under that path.
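You can confirm this directly on HDFS:

    # one event log per application ID, finalized when the application finishes
    hdfs dfs -ls /user/spark/applicationHistory

Note that these are Spark event logs, which the Spark History Server renders, rather than raw driver stdout/stderr.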


The /user/spark/driverLogs path in HDFS contains only the Spark driver logs, and only when the Spark application runs in client mode.
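In cluster mode the driver runs inside a YARN container, so once the application finishes, its stdout/stderr can be retrieved through YARN log aggregation (assuming log aggregation is enabled on your cluster), for example:

    # replace <application_id> with your YARN application ID
    yarn logs -applicationId <application_id>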


Please refer to the link below for more details: https://docs.cloudera.com/documentation/enterprise/6/properties/6.3/topics/cm_props_cdh630_spark.htm...