
Monitoring Apache Spark Logs and the Dynamic App/Driver logs


We are running a Spark Streaming application on a standalone setup (Spark 1.6).

Logging in Spark seems to be a bit scattered, and I am attempting to configure a Nagios log-file monitor that checks for certain "errors" in log files and sends out alerts.

 

My current understanding of Spark's logs is the following:

 

  1. Spark-Worker has its own logs; in my case they are written to a static location, /var/log/spark/spark-worker.out
  2. Spark-Master has its own logs; in my case they are written to a static location, /var/log/spark/spark-master.out
  3. I can configure the log4j.properties file under /etc/spark/conf/ to alter the format, appenders, etc. for the spark-worker and spark-master logs (see the sketch after this list)
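
For point 3, what I have in mind is a minimal log4j.properties along the lines of the default Spark conf template; the appender name and pattern here are just illustrative:

    # Root logger at INFO, everything routed to one console appender
    log4j.rootCategory=INFO, console
    log4j.appender.console=org.apache.log4j.ConsoleAppender
    log4j.appender.console.target=System.err
    log4j.appender.console.layout=org.apache.log4j.PatternLayout
    log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n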

Now for the driver and application/executor logs: the location of these logs seems to be dynamic, and Spark generates new directories under /var/run/spark/work in my case.

 

My issue:

Monitoring the static log files is straightforward for spark-worker and spark-master, but I am a bit confused as to how the dynamic app and driver logs can be monitored.

From what I read in the documentation, it seems that on spark-submit I can pass a -D option pointing to a log4j.properties file.
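
If I understand that correctly, the invocation would look something like the following, assuming the standard spark.driver.extraJavaOptions / spark.executor.extraJavaOptions settings (the config file path is a placeholder, and for executors the file would need to exist at that path on every worker node, since it is read locally):

    spark-submit \
      --conf "spark.driver.extraJavaOptions=-Dlog4j.configuration=file:/etc/spark/conf/log4j-custom.properties" \
      --conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=file:/etc/spark/conf/log4j-custom.properties" \
      ...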

 

Can this be configured to stream the logs to a local syslog in a static location and then have Nagios monitor that static log?
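
For example, something like the following sketch is what I am picturing, using log4j's SyslogAppender (the host, facility, and pattern here are assumptions on my part):

    # Send driver/executor logs to the local syslog daemon
    # (assumes the daemon accepts UDP on the standard syslog port)
    log4j.rootCategory=INFO, SYSLOG
    log4j.appender.SYSLOG=org.apache.log4j.net.SyslogAppender
    log4j.appender.SYSLOG.syslogHost=localhost
    log4j.appender.SYSLOG.facility=LOCAL1
    log4j.appender.SYSLOG.facilityPrinting=false
    log4j.appender.SYSLOG.layout=org.apache.log4j.PatternLayout
    log4j.appender.SYSLOG.layout.ConversionPattern=%p %c{1}: %m%n

The local syslog daemon could then route facility local1 to one fixed file (e.g. /var/log/spark-apps.log is a name I am making up) for Nagios to watch.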

 

What have others done in this case?