Created on 10-25-2021 06:16 AM - edited 10-25-2021 06:23 AM
Hi everyone,
I tried to run a Spark job on Cloudera while overriding some basic Spark configurations. I created a Spark job with the following Configurations (optional):
I executed this job with a CDE CLI command. The submitter kept my overridden values for these 3 Spark configurations, but those values were not picked up by my Spark driver. For instance, I found a "default" tmpPath for spark.eventLog.dir instead of myPath, and my driver used the client deployMode.
Some other configurations (like spark.history.fs.logDirectory) were correctly overridden at the same time.
Has anyone run into this problem?
Thank you 🙂
Created on 10-25-2021 10:59 PM - edited 10-25-2021 11:04 PM
The precedence of Spark configuration parameters, from lowest (left) to highest (right), is:
spark-defaults.conf --> spark-submit/spark-shell --> Spark code (Scala/Java/Python)
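The precedence order above can be pictured as three dictionaries merged in sequence, where each later layer overrides the earlier ones. This is only a plain-Python illustration, not how Spark resolves configuration internally; the function name `effective_conf` and all paths and values are hypothetical.

```python
def effective_conf(defaults, submit_flags, code_settings):
    """Merge three config layers; later layers override earlier ones."""
    merged = dict(defaults)        # lowest precedence: spark-defaults.conf
    merged.update(submit_flags)    # middle: --conf flags passed to spark-submit
    merged.update(code_settings)   # highest: values set in the application code
    return merged


# Hypothetical values, chosen only to show which layer wins.
defaults = {
    "spark.eventLog.enabled": "false",
    "spark.eventLog.dir": "/tmp/spark-events",
}
submit_flags = {
    "spark.eventLog.enabled": "true",
    "spark.eventLog.dir": "/my/path",
}
code_settings = {
    "spark.eventLog.dir": "/set/in/code",
}

conf = effective_conf(defaults, submit_flags, code_settings)
for key in sorted(conf):
    print(f"{key} = {conf[key]}")
```

In this sketch the value set in code wins for spark.eventLog.dir, while spark.eventLog.enabled keeps the spark-submit value because no higher layer overrides it.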
To see the parameter values that are actually applied, run spark-submit in verbose mode:
spark-submit --verbose
Please double-check the spark-submit command and its parameters, for example:
--conf spark.eventLog.enabled=true
--conf spark.eventLog.dir=<directory>
--conf spark.submit.deployMode=cluster
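To make the flags above easier to check, here is a small sketch that assembles the full spark-submit command line from a dictionary of settings. The helper `build_spark_submit`, the application file `my_job.py`, and the event log path are hypothetical placeholders; only the `--verbose` and `--conf` flags come from this thread.

```python
import shlex


def build_spark_submit(app, confs, verbose=True):
    """Return the spark-submit argument list for the given app and --conf settings."""
    cmd = ["spark-submit"]
    if verbose:
        cmd.append("--verbose")  # prints the parsed arguments and Spark properties
    for key, value in confs.items():
        cmd += ["--conf", f"{key}={value}"]
    cmd.append(app)
    return cmd


# Placeholder values; replace the directory and application with your own.
confs = {
    "spark.eventLog.enabled": "true",
    "spark.eventLog.dir": "/path/to/event-logs",
    "spark.submit.deployMode": "cluster",
}

cmd = build_spark_submit("my_job.py", confs)
print(shlex.join(cmd))
```

Printing the command before running it makes it easy to confirm that each override is really being passed, and the --verbose output can then be compared against it.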
Created 10-28-2021 11:34 PM
@SimonBergerard, Has the reply helped resolve your issue? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future.
Regards,
Vidya Sargur,