
log4j.properties override spark executor

New Contributor

In a Spark application I want to configure the log levels for just my own packages, on both the driver and the executors. Can this be done without copying the entire spark-client/conf and spark-client/bin into your project and making modifications in various places? For the driver it is easy because of the --driver-java-options switch in spark-submit, but for the executors a similar switch is lacking.
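To make it concrete, what I am after is a log4j.properties along these lines (the package name is a placeholder for my own code):

# keep everything at WARN, but log my own packages at DEBUG
log4j.rootLogger=WARN, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n

# my own packages (placeholder name)
log4j.logger.com.mycompany.myapp=DEBUG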

1 ACCEPTED SOLUTION

Super Guru

Hi @MarcdL

Did you try with --conf "spark.executor.extraJavaOptions"?


7 REPLIES


New Contributor

Hi Jitendra,

I believe I did, but I got stuck on the fact that -Dlog4j.configuration needs an absolute path, while --files merely ships log4j.properties to the executor, where it is only available at a relative path such as ./log4j.properties.

Best wishes, Marc

Super Guru

@MarcdL Can you please set that property explicitly, either in code or on the spark-submit command line?
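In code that would look something like this minimal sketch (the app name is a placeholder, and log4j.properties is assumed to be shipped to the executors with --files so the relative path resolves in the container working directory):

import org.apache.spark.{SparkConf, SparkContext}

// Sketch: set the executor JVM options programmatically instead of
// on the spark-submit command line. Must happen before the
// SparkContext is created, since it affects executor launch.
val conf = new SparkConf()
  .setAppName("log4j-override-example") // placeholder name
  .set("spark.executor.extraJavaOptions", "-Dlog4j.configuration=log4j.properties")
val sc = new SparkContext(conf)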

New Contributor

Hi Jitendra,

You are right, it works this way!

The relevant lines in my spark-submit run script are:

--files external/log4j.properties \

--conf "spark.executor.extraJavaOptions='-Dlog4j.configuration=log4j.properties'" \

--driver-java-options "-Dlog4j.configuration=file:/absolute/path/to/your/project/external/log4j.properties" \

The result is still confusing, because YARN keeps pushing INFO messages even if your project's root logger is set to WARN. But this setup does allow you to set the logging level for your own project's packages.
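Possibly the remaining noise can be reduced by lowering the framework loggers explicitly in the same log4j.properties (I have not verified this); note it only affects loggers inside the application's own JVMs, not output produced by YARN's daemons:

log4j.logger.org.apache.spark=WARN
log4j.logger.org.apache.hadoop=WARN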

Thanks for pushing me into this final step!

Marc

Contributor

You can just use sc.setLogLevel("ERROR") in your code to suppress log information without changing the log4j.properties file.
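A minimal sketch in Scala (the app name is a placeholder; the level argument accepts ALL, DEBUG, ERROR, FATAL, INFO, OFF, TRACE, WARN):

import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(new SparkConf().setAppName("set-loglevel-example"))
// Overrides any user-defined log settings for this application
sc.setLogLevel("ERROR")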

New Contributor

--driver-java-options "-Dlog4j.debug=true -Dlog4j.configuration=file://${BASEDIR}/src/main/resources/log4j.properties" \

--conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=file:///log4j.properties" \

--files "{${BASEDIR}/src/main/resources}" \

So, for the driver I use the driver-java-options, and for the executor I combine the conf and the files.


I tried to follow the same steps mentioned above to override the log4j properties, but it is still not working. My problem is that I am running a Kafka streaming job in YARN cluster mode, and when I look at the logs in the web UI after an hour they have grown large. I would like to know how I can write the logs to the local file system or HDFS, so that I can view them in a Unix terminal instead of through the web UI.
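One possible starting point, offered as an untested sketch: Spark on YARN sets the system property spark.yarn.app.container.log.dir in each container, so a RollingFileAppender in the shipped log4j.properties can write a size-capped local file there, which stays readable from the node and through YARN's log utility (file name and sizes below are placeholders):

log4j.rootLogger=WARN, rolling
log4j.appender.rolling=org.apache.log4j.RollingFileAppender
# write into the container's YARN log directory, capped in size
log4j.appender.rolling.File=${spark.yarn.app.container.log.dir}/app.log
log4j.appender.rolling.MaxFileSize=50MB
log4j.appender.rolling.MaxBackupIndex=5
log4j.appender.rolling.layout=org.apache.log4j.PatternLayout
log4j.appender.rolling.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n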