Hi All,
I have a spark application running in YARN cluster mode: spark-submit --master yarn --deploy-mode cluster ...
The Spark logs (driver and executor) are stored on HDFS (/user/spark/driverLogs) and are available via the Cloudera Web UI (Cloudera Web UI -> Clusters -> YARN -> Applications tab -> click the application ID -> Logs) while the Spark application is running.
However, once the application completes or fails, the logs are removed from the HDFS /user/spark/driverLogs folder and are therefore no longer available via the Cloudera Web UI.
When running the application in YARN client mode (spark-submit --master yarn --deploy-mode client ...), the logs are kept after the application completes.
My spark-defaults.conf is attached.
Any ideas about the root cause?
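For reference, these are the driver-log properties I believe are involved (a sketch only; the property names are from Spark's driver log documentation, and the actual values on my cluster are in the attached spark-defaults.conf):

```
# Write the driver log to a persistent directory on HDFS
# (this is what makes it visible under /user/spark/driverLogs)
spark.driver.log.persistToDfs.enabled=true
# HDFS directory the driver logs are written to
spark.driver.log.dfsDir=/user/spark/driverLogs
```

Note these properties only cover the driver log in client mode; in cluster mode the driver runs inside a YARN container, so I suspect its logs follow the YARN container log / log-aggregation lifecycle instead.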