Config log4j in Spark


I have read the other threads about this topic, but I can't get it to work.

 

I'm using Cloudera 5.4.8 with Spark 1.3.0, and I created this log4j.properties:


log4j.rootCategory=DEBUG, RollingAppender, myConsoleAppender
log4j.logger.example.spark=debug

log4j.appender.myConsoleAppender=org.apache.log4j.ConsoleAppender
log4j.appender.myConsoleAppender.layout=org.apache.log4j.PatternLayout
log4j.appender.myConsoleAppender.Target=System.out
log4j.appender.myConsoleAppender.layout.ConversionPattern=%d [%t] %-5p %c - %m%n

log4j.appender.RollingAppender=org.apache.log4j.DailyRollingFileAppender
log4j.appender.RollingAppender.File=/opt/centralLogs/log/spark.log
log4j.appender.RollingAppender.DatePattern='.'yyyy-MM-dd
log4j.appender.RollingAppender.layout=org.apache.log4j.PatternLayout
log4j.appender.RollingAppender.layout.ConversionPattern=[%p] %d %c %M - %m%n

 

When I execute my code, I pass the --files flag with the location of the log4j.properties:

spark-submit --name "CentralLog" --master yarn-client  --class example.spark.CentralLog --files /opt/centralLogs/conf/log4j.properties#log4j.properties --jars $SPARK_CLASSPATH --executor-memory 2g  /opt/centralLogs/libProject/produban-paas.jar 

 

I have developed a small Scala application for Spark that uses log4j. It writes both log.error and log.debug messages, but I can only see the log.error output, not the log.debug output.
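
For reference, a minimal sketch of the kind of code I mean (the package and class names are taken from the spark-submit command below; the log messages are illustrative):

package example.spark

import org.apache.log4j.Logger

object CentralLog {
  // One logger per class; its level is governed by log4j.properties
  // (here by the log4j.logger.example.spark entry above).
  val log = Logger.getLogger("example.spark.CentralLog")

  def main(args: Array[String]): Unit = {
    log.error("an ERROR message: this one shows up")
    log.debug("a DEBUG message: this one goes missing")
  }
}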

I assume that if I use --files, the same log4j.properties is used for both the driver and the executors.

 

Does anyone have a clue about what could be wrong?

1 ACCEPTED SOLUTION

Mentor

> I assume that if I use --files, the same log4j.properties is used for both the driver and the executors.

 

Where are you expecting your logs to be visible, by the way: at the driver, or within the executors? Since you are using yarn-client mode, the custom log4j configuration passed via --files is applied only to the executors.

 

If you'd like it applied to the driver as well using only --files, you will need to use yarn-cluster mode, like so:

 

spark-submit --name "CentralLog" --master yarn-cluster  --class example.spark.CentralLog --files /opt/centralLogs/conf/log4j.properties#log4j.properties --jars $SPARK_CLASSPATH --executor-memory 2g  /opt/centralLogs/libProject/produban-paas.jar 

 

Otherwise, additionally pass an explicit -Dlog4j.configuration=file:/opt/centralLogs/conf/log4j.properties through spark.driver.extraJavaOptions, like so:

 

spark-submit --name "CentralLog" --master yarn-client  --class example.spark.CentralLog --files /opt/centralLogs/conf/log4j.properties#log4j.properties --conf spark.driver.extraJavaOptions='-Dlog4j.configuration=file:/opt/centralLogs/conf/log4j.properties' --jars $SPARK_CLASSPATH --executor-memory 2g  /opt/centralLogs/libProject/produban-paas.jar 


4 REPLIES


New Contributor

I have 5 Spark applications and I want 5 separate application logs. How can this be achieved?

Explorer

Hi,

 

I am trying to use a custom log4j configuration to gather Spark driver logs (submitting jobs in cluster mode), but I am unable to get it working.

Here is my custom log4j.properties file content:

log4j.rootCategory=ALL,FILE
log4j.appender.FILE=org.apache.log4j.RollingFileAppender

# Below is the Unix server path from which the job is submitted
log4j.appender.FILE.File=/some/path/to/edgeNode/SparkDriver.log

log4j.appender.FILE.Append=false
log4j.appender.FILE.layout=org.apache.log4j.PatternLayout
log4j.appender.FILE.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n

 

And the command to submit the job:

spark2-submit --files /apps/test/config/driver_log4j.properties --conf "spark.driver.extraJavaOptions=-Dlog4j.configuration=driver_log4j.properties" --master yarn --deploy-mode cluster --num-executors 2 --executor-cores 4 --driver-memory 1g --executor-memory 16g --keytab XXXXX.keytab --principal XXXXX --class com.test.spark.par_1_submit par_submit.jar

 

The error I'm getting:

java.io.FileNotFoundException: /some/path/to/edgeNode/SparkDriver.log (No such file or directory)

New Contributor

The application expects the log directory to exist before it can write logs into it.

It seems your problem can be solved by creating the folder on the node where the driver runs:

/some/path/to/edgeNode/
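
For example (a sketch; note that in yarn cluster mode the driver runs in a container on one of the NodeManager hosts, so the directory must exist on any host that might run the driver):

# create the log directory ahead of time
mkdir -p /some/path/to/edgeNode/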

 

Also note that you have specified the log4j file only for the driver program. For the executors to generate logs as well, you may need to pass the following option to spark-submit:
"spark.executor.extraJavaOptions=-Dlog4j.configuration=driver_log4j.properties"

Contributor

Spark properties control most application parameters and can be set by using a SparkConf object, or through Java system properties.
Environment variables can be used to set per-machine settings, such as the IP address, through the conf/spark-env.sh script on each node.
Logging can be configured through log4j.properties.
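
A minimal sketch of the first mechanism (the app name and memory value are illustrative, reused from this thread):

import org.apache.spark.{SparkConf, SparkContext}

object ConfExample {
  def main(args: Array[String]): Unit = {
    // Programmatic configuration: properties set directly on SparkConf
    // take precedence over spark-submit flags and spark-defaults.conf.
    val conf = new SparkConf()
      .setAppName("CentralLog")
      .set("spark.executor.memory", "2g")
    val sc = new SparkContext(conf)
    // ... job code ...
    sc.stop()
  }
}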