Config log4j in Spark - Driver Logs
Labels: Apache Spark
Created on ‎06-06-2019 11:50 PM - edited ‎09-16-2022 07:25 AM
Hi,
I am trying to use a custom log4j configuration to collect the Spark driver logs (jobs are submitted in cluster mode), but I have been unable to get it working.
Here is the content of my custom log4j.properties file:
log4j.rootCategory=ALL,FILE
log4j.appender.FILE=org.apache.log4j.RollingFileAppender
# Path on the Unix edge node from which the job is submitted
log4j.appender.FILE.File=/some/path/to/edgeNode/SparkDriver.log
log4j.appender.FILE.Append=false
log4j.appender.FILE.layout=org.apache.log4j.PatternLayout
log4j.appender.FILE.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
And the command used to submit the job:
spark2-submit --files /apps/test/config/driver_log4j.properties --conf "spark.driver.extraJavaOptions=-Dlog4j.configuration=driver_log4j.properties" --master yarn --deploy-mode cluster --num-executors 2 --executor-cores 4 --driver-memory 1g --executor-memory 16g --keytab XXXXX.keytab --principal XXXXX --class com.test.spark.par_1_submit par_submit.jar
The error I'm getting:
java.io.FileNotFoundException: /some/path/to/edgeNode/SparkDriver.log (No such file or directory)
Created ‎06-07-2019 01:01 AM
Since your intent seems to be to capture the driver logs in a separate file while running the app in cluster mode, make sure the '/some/path/to/edgeNode/' directory exists on every NodeManager host. In cluster mode the driver runs inside the YARN application master, which can be launched on any of those hosts rather than on the edge node you submit from.
If you can't guarantee that, the general practice is to point the log file at a path that already exists on every node and is writable by the user running the container, e.g. "/var/log/SparkDriver.log".
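One possible way to avoid pre-creating a directory on every node is to write the file into the YARN container's own log directory. The sketch below is an assumption layered on the original file, not a confirmed fix for this cluster: it reuses the appender from the question and only swaps the path for the `spark.yarn.app.container.log.dir` property described in Spark's "Running on YARN" documentation. The file name SparkDriver.log is carried over from the question.

```properties
# Sketch of driver_log4j.properties for cluster mode (assumes Spark on YARN with log4j 1.x).
log4j.rootCategory=ALL,FILE
log4j.appender.FILE=org.apache.log4j.RollingFileAppender
# Resolved on the node that runs the driver container, so no directory has to be pre-created.
log4j.appender.FILE.File=${spark.yarn.app.container.log.dir}/SparkDriver.log
log4j.appender.FILE.Append=false
log4j.appender.FILE.layout=org.apache.log4j.PatternLayout
log4j.appender.FILE.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
```

With this layout the file lives alongside the container's stdout and stderr. If log aggregation is enabled on the cluster, it should be retrievable after the application finishes with `yarn logs -applicationId <application ID>`.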
