Support Questions

GrazittiAPI · 10-25-2021

Hi everyone,

I tried to run a Spark job on Cloudera while overriding some basic Spark configurations. I created a Spark job with the following Configurations (optional):

spark.eventLog.dir = myPath
spark.eventLog.enabled = true
spark.submit.deployMode = cluster

I executed this job with a cde cli command. The submitter keeps my overridden values for these 3 Spark configurations, but the values were not picked up by my Spark driver. For instance, I found a "default" tmpPath for spark.eventLog.dir instead of myPath, and my driver used the client deployMode.

Some other configurations (like spark.history.fs.logDirectory) were correctly overridden at the same time.

Have you ever encountered this problem?

Thank you 🙂

RangaReddy · 10-25-2021

Hi @SimonBergerard

The precedence order of Spark configuration parameters (lowest on the left, highest on the right) is:

spark-defaults.conf --> spark-submit/spark-shell --> spark code (scala/java/python)
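To make the ordering above concrete, here is a minimal pure-Python sketch of how three layers combine when the right-most layer wins. It only illustrates the precedence rule and is not how Spark itself is implemented. The function name `resolve_conf` and the `/tmp/spark-events` default are assumptions made up for the example.

```python
from collections import ChainMap


def resolve_conf(defaults_file, submit_conf, code_conf):
    """Merge three config layers so that the right-most layer wins.

    ChainMap searches its maps from left to right, so the
    highest-priority layer (spark code) is passed first.
    """
    return dict(ChainMap(code_conf, submit_conf, defaults_file))


# Hypothetical values standing in for spark-defaults.conf
defaults_file = {
    "spark.eventLog.enabled": "false",
    "spark.eventLog.dir": "/tmp/spark-events",  # assumed default, for illustration
}
# Values passed at submit time (as in the question)
submit_conf = {
    "spark.eventLog.enabled": "true",
    "spark.eventLog.dir": "myPath",
}
# Nothing is set in the application code in this example
code_conf = {}

effective = resolve_conf(defaults_file, submit_conf, code_conf)
print(effective["spark.eventLog.dir"])
```

If the driver still reports the default directory, one thing to check under this model is whether the application code (the highest layer) sets the same key.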

If you want to see the parameter values that are actually applied, you can run spark-submit in --verbose mode:

spark-submit --verbose

Please double-check your spark-submit command and its parameters:

--conf spark.eventLog.enabled=true

--conf spark.eventLog.dir=<directory>

--conf spark.submit.deployMode=cluster
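As a rough sketch, the flags above can be assembled into a single spark-submit command line. The helper `build_spark_submit` and the application file `my_job.py` are hypothetical names used only for illustration. The question uses the cde cli rather than spark-submit directly, so this only shows the spark-submit form.

```python
import shlex


def build_spark_submit(conf, app, verbose=True):
    """Build a spark-submit argument list from a dict of Spark properties."""
    args = ["spark-submit"]
    if verbose:
        # --verbose makes spark-submit print the properties it resolved
        args.append("--verbose")
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    args.append(app)
    return args


conf = {
    "spark.eventLog.enabled": "true",
    "spark.eventLog.dir": "myPath",
    "spark.submit.deployMode": "cluster",
}

# "my_job.py" is a placeholder application path, not from the original post
cmd = build_spark_submit(conf, "my_job.py")
print(shlex.join(cmd))
```

Building the arguments as a list keeps each `--conf key=value` pair as one argument, which avoids quoting mistakes if a value contains spaces.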

VidyaSargur · 10-28-2021

@SimonBergerard, Has the reply helped resolve your issue? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future.

Regards,

Vidya Sargur,
Community Manager


My SparkConfigurations are not overwriting in my Spark driver